Merging Tables To Kill JOINs
Thursday, June 25th, 2009I was working on this project that has several large tables, and many JOINs. It had reached the point where the JOINs were becoming too costly, but fortunately, some of the tables could be merged. The problem, though, was figuring out how to get the data merged. I came up with a solution that uses views and rules… Let’s say there’s 2 tables, “main_table” and “tbl1″, and “tbl1″ needs to be merged with “main_table”:
-- Create new columns in main_table
ALTER TABLE main_table ADD COLUMN col1 VARCHAR(32);
ALTER TABLE main_table ADD COLUMN col2 VARCHAR(32);
-- Create view for tbl1
CREATE VIEW tbl1_view AS SELECT * FROM tbl1;
-- Create rule for view, tbl1_view
CREATE RULE tbl1_view_upd AS ON UPDATE TO tbl1_view DO INSTEAD (
UPDATE main_table SET col1 = NEW.col1, col2 = NEW.col2
WHERE id = NEW.refid;
);
The view comes in useful, because it doesn’t affect updates to tbl1 while you’re working. Now, if you simply do an update to the view, it will update the newly created columns in “main_table” to reflect the ones in “tbl1″. The view and old table can then be safely dropped once the data is merged.
UPDATE tbl1_view SET id = id; DROP VIEW tbl_view; DROP TABLE tbl1;
Another view with rules can then be created to represent the data that was previously in “tbl1″ and redirect writes to “main_table”, so things don’t break while backend code is being updated. My solution turned out to be a simple way to merge the tables. I would definitely consider a different approach if you’re working with millions of rows, though.
