20. Index basics
What is an index?
Imagine a giant book with 10 million pages.
You want the page that talks about “Spark ETL”.
Without an index, you would:
Flip page 1
Flip page 2
Flip page 3...
…until you find it.
That’s slow.
Now imagine doing that on a database table with a million rows.
Same pain.
An index is a shortcut : like the index at the back of a book.
In books:
“Spark ETL → Page 567”
In databases:
WHERE name = "Kavana"
→ The index jumps directly to the row instead of scanning everything.
Why do we need an index?
Without an index:
Database checks EVERY row. (called full table scan)
With an index:
Database jumps to the exact row instantly.
Speed difference:
Without index: O(n) → slow as the table grows
With index: O(log n) → lightning fast
Small SQL example
Table: users
Columns: id, name, email
If we search:
Without index on email:
DB checks all 5 million rows.
With index on email
DB jumps directly to the row.
Creating the index:
What indexes actually store
An index is like a mini sorted phonebook that maps:
Think:
index = “cheat sheet”
table = “full data”
Downside
Indexes speed up searching, but slow down inserts/updates, because the index must stay sorted.
So:
Good for searching (WHERE, JOIN)
Too many indexes = slower writes
Comments
Post a Comment