Not a DBA

May 25, 2007

I don't consider myself an expert in database design or administration, but like many web programmers, my applications often rely on fast and efficient interaction with a database. I am fairly comfortable normalizing a schema and identifying fields to index, but I sometimes struggle creating efficient queries... especially when multiple table joins are needed.

I'm looking for suggestions on how I can improve in this area.

Dynamically Joining Tables

Currently, I'm working with a database containing multiple sets of interconnected objects. Let's assume they are television shows, actors in the shows, and times the shows are on. The goal is to allow users to apply a set of filters to find the shows that interest them. Some simple use scenarios include:

Find all shows where ActorA appears.
Find all shows on between 10AM and 12 PM.

Some harder scenarios include:

Which show or actor appears most frequently.
Which shows are on between 10AM and 12 PM, staring ActorA, but not ActorB.
Which shows are on between 10AM and 12 PM, or star ActorC.
Which shows star ActorA and ActorC, but not ActorB or ActorD.

As you can see from these examples, users can combine filter criteria (actors, times, etc) in a variety of different ways to find the information they need. My team has created a query system that relies on inner queries to search for this information. Performance is not that good. I know that sometimes the order nested queries execute can yield better performance. We're struggling to find the algorithm that will provide the right mixture.

We're also exploring other ideas like

utilizing views to try and simplify things. I'm concerned we're just pushing down the complexity to another level.
creating stored procedures to perform the filtering logic. Again, will we find performance gains by pushing the complexity down a level?

Discussion

I'm sure this problem is not unique. What techniques have others employed to find performance gains (query optimization, views, stored procedures, something else).

How do you verify that the gains are real? Do you solely look at query plans? If not, how do you keep the query cache from skewing the results?

Search This Blog

Loosely Coupled

Not a DBA

Dynamically Joining Tables

Discussion

Comments

Popular posts from this blog

I Believe...

Professionalism

Thoughts about a different StackOverflow reputation system