Cardinality refers to a number of unique values in a row.
You can find the cardinally of any column using the SQL keyword DISTINCT.
Example:
Cardinality: select count(distinct (deptno)) from employee;
The above SQL statement displays the number of department in
a company or in other words the number of unique value that appears on a deptno
column of an employee table.
Selectivity is
the ratio of cardinality to the number of records of an Indexed column.
Selectivity = (Distinct Values/Total number of records)
If the ratio or selectivity is 1, it means all the records
in a row are unique and is also referred to as high selectivity. An Indexed
column with high selectivity is the most optimized column and is normally faster for select from a table. Column with Primary Key or Unique constraints have a selectivity of 1.
Employee Table column “gender” stores gender information for
20,000 employees of our company. This column stores just two unique records (M/F).
This is a solid example of low selectivity column. Column that stores Boolean records like Y/N, 1/0, T/F is another example of low selectivity. Whenever, there is low or bad selectivity, the
optimizer utilizes Full Table Scan (FTS) to fetch data. FTS is bad
especially with large table causing a lot of overhead to the database engine.
Selectivity = (total unique count/total records)
= (2/20,000)
= 0.0001
= (2/20,000)
= 0.0001
As a database developer or dba, you cannot modify company’s
data to raise the selectivity for a performance. How do you improve selectivity
for a column with low selectivity for performance on such case? Adding a
composite Index is the best way to make a low selectivity high. How does it
improve the selectivity? Using more than one column makes the Index more unique
which in turns improves Index selectivity.
How to measure Index Selectivity in Oracle?
Syntax:- SELECT (distinct_keys / num_rows) AS Selectivity FROM dba_indexes WHERE index_name like 'Index_Name' and Owner ='Table_Owner';
Example: SELECT (distinct_keys / num_rows) AS Selectivity FROM dba_indexes WHERE index_name like 'PK_EMP' and Owner ='BANIYA';
Using our standard employee table, we are going to demonstrate you
how to improve the selectivity? What is the selectivity of deptno column on a
table? The company has 3 departments and 14 employees. The selectivity is for
deptno is around 0.22 and it considered low selectivity. As stated above,
creating a composite Index makes the data unique and improves the selectivity. Let’s
create composite Index and measure the selectivity again?
--Composite Index CREATE INDEX composite_index ON baniya.employee( deptno,sal, comm);
Selectivity: SELECT (distinct_keys / num_rows) AS Selectivity, distinct_keys, num_rows FROM dba_indexes WHERE index_name like 'COMPOSITE_INDEX' and Owner ='BANIYA';
Output:
The selectivity has now changed from 0.22 to nearly 1 or
almost 1. We have just increased the
selectivity and the performance of a query where deptno column is used.
The difference between these two keywords are clear.The term selectivity
is not the same as cardinality. These words are often used interchangeably by
those who do not know the difference. Now, you know the difference, you are
always going to pick the right keyword going forward.
Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.
Have a Database-ious Day!
Have a Database-ious Day!
Awesome Explanation
ReplyDeleteok, but how to improve the scalability of index, e.g. index has scalability 0.00002761500267865525982956020346733973633195 how to improve it?
ReplyDelete