Managing Row Counts Using Triggers

It seems like this is a subject that I keep coming back to. Every situation is unique, and demands it’s own solution; so far, there’s no one-size-fits all solution for working around COUNT(*) in PostgreSQL. It some situations, there is no need for a workaround, and the performance hit of COUNT(*) is neglectable (this is generally true if you have under 2000 rows, but it mainly depends on the server).

This time, the problem I ran into is how to maintain the row count for how many friends a person has. Let’s say we have 2000 users, and each user has 30 friends. That would be 60,000 rows, and out of the range of what COUNT(*) can handle. The solution here is to create a count index table, that tracks how many friends each user has. The friends table looks something like this:

CREATE TABLE friends (
    your_id BIGINT NOT NULL,
    friend_id BIGINT NOT NULL,

    CONSTRAINT friends_chk_your_id FOREIGN KEY
        (your_id) REFERENCES members (id) DEFERRABLE,
    CONSTRAINT friends_chk_friend_id FOREIGN KEY
        (friend_id) REFERENCES members (id) DEFERRABLE
);

CREATE INDEX friends_your_id_idx ON friends (your_id);
CREATE INDEX friends_friend_id_idx ON friends (friend_id);
CREATE UNIQUE INDEX friends_unq_id_idx ON friends (your_id, friend_id);

Nothing special, just your member id, and your friends member id, with a unique index to make sure there’s no duplicate entries, and some foreign keys to make sure the user id’s are valid. I want to keep track of how many friends a user has, so that would be a simple two column table:

CREATE TABLE friends_count_idx (
    num_rows INTEGER NOT NULL,
    your_id BIGINT NOT NULL,

    PRIMARY KEY(your_id),
    CONSTRAINT friends_count_idx_chkcount CHECK (num_rows >= 0)
);

CREATE INDEX friends_count_idx_num_rows ON friends_count_idx (num_rows);

Pretty simple, right? A primary key for your_id, and a check constraint to make sure you don’t go under 0 rows. Now here’s where the fun begins.

I want a function to initialize the friends_count_idx table, as well as clean it up if something ever happens and it starts to drift. So basically, we’re looking at a maintenance function that deletes any existing records, as well as inserts records based on what’s in friends:

CREATE OR REPLACE FUNCTION friends_count_idx_materialize ()
RETURNS VARCHAR AS $$
    DECLARE
        rows INTEGER;
    BEGIN
        -- Delete old records
        DELETE FROM friends_count_idx;

        INSERT INTO friends_count_idx (num_rows, your_id)
        SELECT COUNT(*), your_id FROM friends
            GROUP BY your_id;

        GET DIAGNOSTICS rows = ROW_COUNT;

        RETURN rows || ' Rows Processed';
    END;
$$ LANGUAGE plpgsql;

SELECT friends_count_idx_materialize();
--------------------
53 Rows Processed
(1 row)

SELECT * FROM friends_count_idx LIMIT 1;
num_rows | your_id
----------+---------
23 |      14
(1 row)

Now we’re getting somewhere… But I also need a set of triggers that handle inserts and deletes. They also need to be able to auto-create a record in friends_count_idx for new users, who are adding their first friend. Still, nothing too hard, but time consuming to write and debug:

CREATE OR REPLACE FUNCTION friends_count_idx_insert ()
RETURNS TRIGGER AS $$
    DECLARE
        r RECORD;
    BEGIN
        SELECT INTO r num_rows FROM friends_count_idx
            WHERE your_id = NEW.your_id;

        -- Auto Create Record
        IF NOT FOUND THEN
            INSERT INTO friends_count_idx (num_rows, your_id)
            SELECT COUNT(*), your_id FROM friends
            WHERE your_id = NEW.your_id GROUP BY your_id;

            RETURN NEW;
        END IF;

        -- Update Existing Record
        UPDATE friends_count_idx SET num_rows = (r.num_rows + 1)
            WHERE your_id = NEW.your_id;

        RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION friends_count_idx_delete ()
RETURNS TRIGGER AS $$
    DECLARE
        r RECORD;
        rows INTEGER;
    BEGIN
        SELECT INTO r num_rows FROM friends_count_idx
            WHERE your_id = OLD.your_id;

        -- Auto Create Record
        IF NOT FOUND THEN
            INSERT INTO friends_count_idx (num_rows, your_id)
            SELECT COUNT(*), your_id FROM friends
            WHERE your_id = OLD.your_id GROUP BY your_id;

            RETURN OLD;
        END IF;

        rows = r.num_rows - 1;

        -- Check for row count drift
        IF rows <= 0 THEN
            RETURN OLD;
        END IF;

        -- Update Existing Record
        UPDATE friends_count_idx SET num_rows = (r.num_rows - 1)
        WHERE your_id = OLD.your_id;

        RETURN OLD;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER friends_count_idx_insert AFTER INSERT ON friends
FOR EACH ROW EXECUTE PROCEDURE friends_count_idx_insert();

CREATE TRIGGER friends_count_idx_delete AFTER DELETE ON friends
FOR EACH ROW EXECUTE PROCEDURE friends_count_idx_delete();

SELECT * FROM friends_count_idx WHERE your_id = 3;

num_rows | your_id
----------+---------
3 |       3
(1 row)

INSERT INTO friends (your_id, friend_id)  VALUES (3, 7);

SELECT * FROM friends_count_idx WHERE your_id = 3;

num_rows | your_id
----------+---------
4 |       3
(1 row)
DELETE FROM friends WHERE your_id = 3 AND friend_id = 7;

SELECT * FROM friends_count_idx WHERE your_id = 3;

num_rows | your_id
----------+---------
3 |       3
(1 row)

So, it’s running smoothly. It could still probably use some more debugging, just to make sure every possible situation is covered, but the skeleton is working fine.

Leave a Reply