Back to Blog
PerformanceOctober 15, 202510 min read

Mastering Database Indexing: From B-Trees to Specialized Indexes

Everything you need to know about database indexes - when to use them, how they work, and optimization strategies.

Database indexing is the single most important performance optimization technique available to developers. A well-indexed database can handle millions of queries per second, while a poorly indexed one crawls to a halt with just thousands.

Yet, indexing is often misunderstood. Many developers add indexes randomly hoping for performance gains, while others avoid them entirely fearing write performance penalties. This guide will demystify indexing and give you the knowledge to make smart indexing decisions.

The Impact

Proper indexing can improve query performance by 100x-1000x. Poor indexing can make even simple queries take seconds instead of milliseconds.

How Indexes Work: The Fundamentals

Think of a database index like an index in a book. Instead of reading every page to find a topic, you look it up in the index and jump directly to the relevant pages.

B-Tree: The Most Common Index Type

B-Tree (Balanced Tree) indexes are the default in most databases. They maintain sorted data and allow for efficient equality, range, and prefix matching.

B-Tree Structure:

Root Node
├── Branch Node (100-200)
│   ├── Leaf Node (1-50) → Row pointers
│   ├── Leaf Node (51-100) → Row pointers
│   └── Leaf Node (101-150) → Row pointers
└── Branch Node (201-300)
    ├── Leaf Node (151-200) → Row pointers
    └── Leaf Node (201-250) → Row pointers

Operations: O(log n) for search, insert, delete. Excellent for range queries and ordered results.

Index Types: Choosing the Right Tool

1. Single-Column Indexes

The most basic type. Index one column for fast lookups.

CREATE INDEX idx_users_email ON users(email);

Best for: Primary key lookups, foreign key joins, unique constraints

2. Composite (Multi-Column) Indexes

Index multiple columns together. Order matters!

CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

Best for: WHERE clauses with multiple conditions, ORDER BY optimization

3. Partial Indexes

Index only rows that match a condition. Smaller, faster indexes.

CREATE INDEX idx_active_users ON users(email) WHERE active = true;

Best for: Frequently queried subsets (active users, recent orders)

4. Unique Indexes

Enforce uniqueness and provide fast lookups.

CREATE UNIQUE INDEX idx_unique_email ON users(email);

Best for: Natural keys, business constraints

Specialized Indexes for Specific Use Cases

Full-Text Search Indexes

For searching within text content.

-- PostgreSQL
CREATE INDEX idx_posts_content ON posts USING gin(to_tsvector('english', content));

-- MySQL
ALTER TABLE posts ADD FULLTEXT(content);

Use for: Blog posts, comments, product descriptions

Spatial (GIS) Indexes

For geographic data and location-based queries.

-- PostgreSQL
CREATE INDEX idx_places_location ON places USING gist(location);

-- MySQL
ALTER TABLE places ADD SPATIAL INDEX(location);

Use for: "Find restaurants within 5 miles", mapping applications

Hash Indexes

Fast equality lookups, but no range queries.

CREATE INDEX idx_users_email_hash ON users USING hash(email);

Use for: Simple equality lookups where range queries aren't needed

The Indexing Decision Framework

When to Create an Index

Frequent Queries:

If a column appears in WHERE, JOIN, or ORDER BY clauses regularly

High Selectivity:

Columns where queries return < 5-10% of total rows

Performance Impact:

When query execution time exceeds acceptable thresholds

Common Indexing Mistakes to Avoid

❌ Indexing Every Column

Indexes speed up reads but slow down writes. Every index has a maintenance cost.

❌ Wrong Column Order in Composite Indexes

Index order should match query patterns. (user_id, created_at) ≠ (created_at, user_id)

❌ Ignoring Index Maintenance

Indexes fragment over time. Rebuild them periodically for optimal performance.

❌ Not Monitoring Index Usage

Unused indexes waste space and slow down writes. Remove them!

Index Maintenance and Monitoring

Monitoring Queries

PostgreSQL: pg_stat_user_indexes
MySQL: performance_schema
SQL Server: sys.dm_db_index_usage_stats

Maintenance Tasks

REINDEX: Rebuild fragmented indexes
VACUUM: Clean up dead tuples (PostgreSQL)
ANALYZE: Update statistics

Advanced Indexing Strategies

Covering Indexes

Indexes that include all columns needed for a query, eliminating the need to access the table.

-- Query: SELECT name, email FROM users WHERE status = 'active'
CREATE INDEX idx_users_covering ON users(status, name, email);

-- Index contains all needed data!

Index-Only Scans

When the query can be satisfied entirely from the index without touching the table.

EXPLAIN SELECT count(*) FROM users WHERE created_at > '2025-01-01';
-- "Index Only Scan" - no table access needed!

Partial Indexes with WHERE

Index only the rows you actually query, reducing index size and maintenance.

-- Only index active users (assume 10% of users are active)
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

-- Smaller index, faster queries, less maintenance

The Cost-Benefit Analysis

Index Costs vs Benefits

Costs

  • • Storage space (indexes can be 10-50% of table size)
  • • Write performance penalty (INSERT/UPDATE/DELETE slower)
  • • Maintenance overhead (rebuilds, statistics updates)
  • • Query optimizer complexity

Benefits

  • • Query performance improvements (10x-1000x faster)
  • • Reduced CPU and I/O usage
  • • Better user experience
  • • Lower infrastructure costs

Real-World Examples

E-commerce Product Search

-- Query: Products by category, sorted by price
SELECT * FROM products 
WHERE category_id = ? AND active = true 
ORDER BY price ASC LIMIT 20;

-- Optimal index:
CREATE INDEX idx_products_category_price 
ON products(category_id, price) 
WHERE active = true;

Composite index handles WHERE + ORDER BY. Partial index reduces size.

User Timeline

-- Query: Recent posts from followed users
SELECT * FROM posts 
WHERE user_id IN (followed_user_ids) 
ORDER BY created_at DESC LIMIT 50;

-- Index per user for fast timeline generation:
CREATE INDEX idx_posts_user_date 
ON posts(user_id, created_at DESC);

Composite index with descending order matches query pattern perfectly.

Tools and Best Practices

Query Analysis:

Use EXPLAIN plans to understand if indexes are being used effectively.

Index Monitoring:

Track index usage and remove unused indexes regularly.

Testing:

Load test with production-like data volumes before deploying indexes.

Documentation:

Document why each index exists and what queries it optimizes.

🎯 The Bottom Line

Indexing is both an art and a science. Start with your most important queries, measure the impact, and iterate. A few well-placed indexes can transform your application's performance.

Remember: indexes are not free. Each one has a cost. But the right indexes at the right time can be the difference between success and failure.

Further Reading

Ready to design your database?

Try Structa free—no credit card required.

Start Building