Archive for April, 2008

Comcast HD Stuff

Monday, April 28th, 2008

I’ve setup some pages for Comcast HD stuff. They are works in progress, since channel lineup and picture quality has been changing fairly frequently.

Comcast Rolls Out New HD Channels

Tuesday, April 22nd, 2008

Comcast silently rolled out three new HD channels in the Seattle/Tacoma area last night. The new channels are:

  • Disney HD
  • ABC Family HD
  • Science Channel HD

Unfortunately, Comcast has started re-encoding HD channels to squeeze three channels to a single QAM, and picture quality is very noticeably degraded on several channels, as I’ve reported before. Channels with poor picture/sound include:

  • Discovery Channel HD
  • USA HD
  • SciFi HD
  • Universal HD (only for certain content)
  • MHD (only for certain content)

PostgreSQL: Ultra Simple COUNT(*) Workaround

Monday, April 21st, 2008

Just a little something I whipped up. It’s only for getting the total number of rows in table X. If you have under 10,000 rows, the performance hit of COUNT(*) is neglectable and you probably don’t need this.

Example:

SELECT __count_add('my_table');
SELECT rows FROM __count_tables WHERE tablename = 'my_table';

Ultra Simple COUNT(*) Workaround

Indexing Functions With PostgreSQL

Tuesday, April 15th, 2008

In PostgreSQL, it’s possible to create expressional indexes. In other words, indexes such as:

CREATE INDEX varchar_column_idx ON bar (UPPER(varchar_column));

It is possible to do this on any function that is marked as IMMUTABLE. IMMUTABLE just means that the function will always return the same result when given the same input. This is very powerful, as you can create indexes on user created functions, which can dramatically improve performance.

For instance, lets say I need to group some data by week, with every week starting on Sunday. I could do:


CREATE OR REPLACE FUNCTION find_previous_sunday
(TIMESTAMP) RETURNS DATE AS $$
    BEGIN
        RETURN
        ($1 + (-extract(dow from $1) || ' days')::interval)::date;
    END;
$$ LANGUAGE plpgsql IMMUTABLE;

Note, that you can always do extract(week from timestamp), but there isn’t much flexibility there, and clients are often picky about when they want their weeks to start, plus this is only an example. Onwards to the benchmark:


---
--- Test without indexing
---
EXPLAIN ANALYZE SELECT
    find_previous_sunday(entry_time) AS week_start,
    SUM(distance)
    FROM tbl
    WHERE user_id = 1
    GROUP BY week_start
    ORDER BY week_start DESC limit 1;

    QUERY PLAN
-------------------------------
 Limit  (... loops=1)
   ->  GroupAggregate  (...)(actual time=1.561..1.561...)
         ->  Sort  (...) (actual time=1.556..1.556 rows=2...)
               Sort Key: find_previous_sunday(entry_time)
               ->  Index Scan using tbl_user_id on tbl
                        (...) (actual time=0.060..1.388 rows=128...)
                     Index Cond: (user_id = 1)
 Total runtime: 1.651 ms
(7 rows)

---
--- Index find_previous_sunday function, and run test again.
---
CREATE INDEX tbl_week_start ON tbl
    (find_previous_sunday(entry_time));

VACUUM FULL ANALYZE;

EXPLAIN ANALYZE SELECT
    find_previous_sunday(entry_time) AS week_start,
    SUM(distance)
    FROM tbl
    WHERE user_id = 1
    GROUP BY week_start
    ORDER BY week_start DESC limit 1;

       QUERY PLAN
-------------------------------
 Limit  (...) (actual time=0.062..0.062 rows=1 loops=1)
   ->  GroupAggregate  (...) (actual time=0.062..0.062...)
         ->  Index Scan Backward using tbl_week_start on tbl
                  (...) (actual time=0.042..0.057 rows=2...)
               Filter: (user_id = 1)
 Total runtime: 0.096 ms
(5 rows)

1.651ms without indexing, 0.096ms with. Note that there’s only about 200 rows in the test table, and performance without indexing scales very poorly as the number of rows increases.

One Of PostgreSQL’s Biggest Weaknesses

Sunday, April 13th, 2008

One of PostgreSQL’s biggest weaknesses is easy replication and clustering. Note, that I said easy. There’s absolutely replication solutions for PostgreSQL, but many of them have serious downfalls, and are a serious pain in the ass to setup.

Slony-I is a decent single-master multi-slave solution, if you don’t need LOBs replicated. It’s a little hard to setup, and you have to learn it’s special syntax for the config files you will have to create. A serious pitfall is it is ascynchronous, and therefore doesn’t guarantee all nodes will have the same information at any given time. Writes don’t scale very well. Also, if the master server goes down, you’re pretty much screwed unless you have a really cleaver way of failing over, since:

  • Sequence replication isn’t always reliable.
  • Triggers are disabled on all nodes, except for the master, to make sure they’re not called twice.
  • Incomplete failover results in errors being thrown on all nodes, including the failover master, with writes.

When using Slony-I, you need something like pgpool-II so you can transparently load balance nodes and redirect writes to the master server.

pgpool-II also has a multi-master replication mode, however, it absolutely sucks. The reason? It’s statement based, so instead of doing an insert and copying the result to all other nodes, it runs the same statement on all nodes, resulting in drift on all fields that use default values.

PgCluster seems to fix just about all of the problems with Slony-I and pgpool-II, by providing synchronous multi-master replication, however:

  • It’s basically a patchset to PostgreSQL, so it tends to run a release behind.
  • It’s a major, major, MAJOR, pain in the ass to setup.

Mozilla Projects - Getting On My Nerves

Thursday, April 10th, 2008

I’m growing sick of Mozilla projects. The problem I have with them? Bugs, bugs, bugs, bugs, bugs.

Just for instance, in Mozilla Thunderbird there’s this bug that has been around for years and is driving me mad. After it’s been running for a while, it suddenly starts sucking up 100% of my CPU, and I have to restart it. It has been reported by many users, and it still doesn’t seem to have been fixed.

Also, with Firefox, after it’s been running for a while (maybe a couple days or so) it starts slowing down. And I mean really slowing down. Sometimes it takes ten seconds for it to respond to a single click. Other times, I have to kill it because it won’t respond. What’s worse, is it also crashes completely on some sites; I’ve run into this many times on del.icio.us. Firefox 3 beta is even worse!

USA HD: Picture Sucks

Monday, April 7th, 2008

I’m not sure exactly what is causing this, but the picture on USA HD absolutely sucks. Even with shows like Monk or House, with little to no action sequences, there’s severe macro-blocking, artifacts, video/audio corruption, audio cutouts, and frame-dropping. This really seems like crappy/overloaded hardware to me (or possibly a bandwidth issue), rather than a re-compression problem. Normally if there’s a re-compression/compression issue, you will notice macro blocking, but not audio/video cutouts or corruption.

Here’s some examples from the movie Final Destination 2 (scaled to ~1/3 the actual size).

In this first frame, it just looks like macro-blocking.

Final Destination 2: Frame 1

However, in the following frames, there’s clearly corruption going on. Also, note that the audio cut out for about a second shortly after this part.

Final Destination 2: Frame 2

Final Destination 2: Frame 3

Some possibilities are:

  1. The hardware Comcast is using for USA and Sci-Fi is either overloaded, or really crappy. I’ve heard Comcast has been using Imagine Communication hardware to triple-load three channels on to a single QAM frequency, and there’s been some issues with that setup.
  2. USA’s encoding hardware is really crappy. That would make sense, since both USA and Sci-Fi are owned by the same company, and both have horrible picture.

Conditional UPDATE And DELETE In PostgreSQL

Friday, April 4th, 2008

One of the coolest PostgreSQL features: Conditional SELECTs/UPDATEs/DELETEs.

Lets say we have a table of distances, but we need to add a column for the distance type, ie, Miles or Kilometers. Conditional UPDATE to the rescue!!!:

UPDATE distances SET measurement =
    (CASE WHEN (name ~* '^[0-9]+K|km.$|kilometers$')
        THEN 'kilometers'
    WHEN (name ~* '^[0-9]+M|mi.$|miles$')
        THEN 'miles'
    ELSE 'unknown' END);

They also work for DELETEs, but I haven’t found any real world situation where it makes a lot of sense. Regardless:

DELETE FROM table WHERE
    (CASE WHEN col1 ~* '^foo|bar$'
        THEN 't'::boolean
    WHEN col2 ~* '^oof|rab$'
        THEN 't'::boolean
    ELSE 'f'::boolean END);