I'm analyzing my queries' performance using New Relic, and this one in particular is taking a long time to complete:
SELECT "events".*
FROM "events"
WHERE ("events"."deleted_at" IS NULL AND
"events"."eventable_id" = $? AND
"events"."eventable_type" = $? OR
"events"."deleted_at" IS NULL AND
"events"."eventable_id" IN (SELECT "flow_recipients"."id" FROM "flow_recipients" WHERE "flow_recipients"."contact_id" = $?) AND "events"."eventable_type" = $?)
ORDER BY "events"."created_at" DESC
LIMIT $? OFFSET $?
Sometimes this query takes more than 8 seconds to complete, and I can't understand why. I have taken a look at the query's EXPLAIN output, but I'm not sure I understand it.
Is there something wrong with my indexes? Is there something I can optimize? How could I further investigate what's going on?
I suspect that the fact that I'm using SELECT events.* instead of selecting only the columns I'm interested in could have some impact, but since I'm using a LIMIT of 15, I'm not sure it would matter that much.
[EDIT]
I have an index on the created_at column and another index on the eventable_id and eventable_type columns. Apparently, this second index is not being used, and I don't know why.
The cause of the long execution time is that the optimizer hopes it can quickly find enough matching rows by scanning all rows in sort order and picking out those that match the condition, but the executor actually has to scan 630835 rows before it finds enough matching rows.
For every row that is examined, the subselect is executed.
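You can see this in the plan yourself by running the statement under EXPLAIN (ANALYZE, BUFFERS), substituting real values for the $? placeholders that New Relic shows (a sketch, assuming PostgreSQL; the literals below are made up):
EXPLAIN (ANALYZE, BUFFERS)
SELECT "events".*
FROM "events"
WHERE ("events"."deleted_at" IS NULL AND
       "events"."eventable_id" = 123 AND          -- made-up value
       "events"."eventable_type" = 'Contact' OR   -- made-up value
       "events"."deleted_at" IS NULL AND
       "events"."eventable_id" IN (SELECT "flow_recipients"."id"
                                   FROM "flow_recipients"
                                   WHERE "flow_recipients"."contact_id" = 456) AND
       "events"."eventable_type" = 'FlowRecipient')
ORDER BY "events"."created_at" DESC
LIMIT 15 OFFSET 0;
If this is the plan in use, you will see an index scan on the created_at index with a large "Rows Removed by Filter" count.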
You should rewrite that OR to a UNION:
SELECT * FROM events
WHERE deleted_at IS NULL
  AND eventable_id = $?
  AND eventable_type = $?
UNION
SELECT * FROM events e
WHERE deleted_at IS NULL
  AND eventable_type = $?
  AND EXISTS (SELECT 1
              FROM flow_recipients f
              WHERE f.id = e.eventable_id
                AND f.contact_id = $?);
This query does the same thing as the original, provided events has a primary key (UNION removes duplicate rows, which is only harmless when rows are unique).
Useful indexes depend on the execution plan chosen, but these might be good:
CREATE INDEX ON events (eventable_type, eventable_id)
WHERE deleted_at IS NULL;
CREATE INDEX ON flow_recipients (contact_id);
I use a ROW_NUMBER() function inside a CTE that causes a SORT operator in the query plan. This SORT operator has always been the most expensive element of the query, but has recently spiked in cost after I increased the number of columns read from the CTE/query.
What confuses me is that the increase in cost is not proportional to the column count. Normally I can increase the column count without much issue. However, it seems my query has passed some threshold and now costs so much that the execution time has doubled, from 1 hour to 2+ hours.
I can't figure out what has caused the spike in cost and it's having an impact on business. Any ideas or next steps for troubleshooting you can advise?
Here is the query (simplified):
WITH versioned_events AS (
SELECT [event].*
,CASE WHEN [event].[handle_space] IS NOT NULL THEN [inv].[involvement_id]
ELSE [event].[involvement_id]
END AS [derived_involvement_id]
,ROW_NUMBER() OVER (PARTITION BY [event_id], [event_version] ORDER BY [event_created_date] DESC, [timestamp] DESC ) AS [latest_version]
FROM [database].[schema].[event_table] [event]
LEFT JOIN [database].[schema].[involvement] as [inv]
ON [event].[service_delivery_id] = [inv].[service_delivery_id]
AND [inv].[role_type_code] = 't'
AND [inv].latest_involvement = 1
WHERE event.deletion_type IS NULL AND (event.handle_space IS NULL
OR (event.handle_space NOT LIKE 'x%'
AND event.handle_space NOT LIKE 'y%'))
)
INSERT INTO db.schema.table (
....
)
SELECT
....
FROM versioned_events AS [event]
INNER JOIN (
SELECT DISTINCT service_delivery_id, derived_involvement_id
FROM versioned_events
WHERE latest_version = 1
AND ([versioned_events].[timestamp] > '2022-02-07 14:18:09.777610 +00:00')
) AS [delta_events]
ON COALESCE([event].[service_delivery_id],'NULL') = COALESCE([delta_events].[service_delivery_id],'NULL')
AND COALESCE([event].[derived_involvement_id],'NULL') = COALESCE([delta_events].[derived_involvement_id],'NULL')
WHERE [event].[latest_version] = 1
Here is the query plan from the version with the most columns, the one that experiences the cost spike (the other plans look the same, except this operator takes much less time, 40-50 mins).
I did a comparison of three executions, each with a different column count in the INSERT INTO ... SELECT ... FROM clause. I can't share the spreadsheet, but I will try to convey my findings so far. The following is true of the query with the most columns:
It takes more than twice as long to execute as the other two executions
It performs more logical & physical reads and scans
It has more CPU time
It reads the most from Tempdb
The increase in execution time is not proportional with the increase in reads or other mentioned metrics
It is true that there is a level 8 memory spill happening. I have tried updating statistics, but it didn't help, and all versions of the query suffer the same spill, so the comparison is still like-for-like.
I know it can be hard to help with this kind of problem without being able to poke around, but I would be grateful if anyone could point me in the direction of what to check / try next.
P.S. the table it reads from is a heap and the table it joins to is indexed. The heap table needs to be a heap, otherwise inserts into it will take too long and the problem is just kicked down the road.
Also, when I say I added more columns, I mean in the SELECT FROM versioned_events statement. The columns are replaced with "...." in the above example.
UPDATE
Using a temp table halved the execution time at the high column count that caused the issue, but it actually takes longer at the reduced column count. It goes back to the idea that a threshold is crossed when the column count is increased :(. In any event, we've used a temp table for now to see if it helps in production.
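For reference, the temp-table variant materializes the CTE once instead of letting it be expanded at both references (a sketch of the shape, not the exact production code; the elided column lists stay elided as in the original):
SELECT [event].*
      ,CASE WHEN [event].[handle_space] IS NOT NULL THEN [inv].[involvement_id]
            ELSE [event].[involvement_id]
       END AS [derived_involvement_id]
      ,ROW_NUMBER() OVER (PARTITION BY [event_id], [event_version] ORDER BY [event_created_date] DESC, [timestamp] DESC) AS [latest_version]
INTO #versioned_events    -- temp table replaces the CTE
FROM [database].[schema].[event_table] [event]
LEFT JOIN [database].[schema].[involvement] AS [inv]
    ON [event].[service_delivery_id] = [inv].[service_delivery_id]
    AND [inv].[role_type_code] = 't'
    AND [inv].latest_involvement = 1
WHERE event.deletion_type IS NULL
  AND (event.handle_space IS NULL
       OR (event.handle_space NOT LIKE 'x%' AND event.handle_space NOT LIKE 'y%'));
-- the INSERT ... SELECT then reads FROM #versioned_events exactly as it read from the CTE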
I am new to this site, but please don't hold it against me. I have only used it once.
Here is my dilemma: I have moderate SQL knowledge but am no expert. The query below was created by a consultant a long time ago.
On most mornings it takes 1.5 hours to run because there is lots of data. BUT on other mornings it takes 4-6 hours. I have tried eliminating any other jobs that are running. I am thoroughly confused about what to try to find out what is causing this problem.
Any help would be appreciated.
I have already broken this query into 2 queries, but any tips on ways to help boost performance would be greatly appreciated.
This query builds back our inventory transactions to find what our stock on hand value was at any given point in time.
SELECT
ITCO, ITIM, ITLOT, Time, ITWH, Qty, ITITCD,ITIREF,
SellPrice, SellCost,
case
    when Transaction_Cost is null
    -- no cost recorded: take ITIACT from the latest qualifying transaction
    -- for the same company/warehouse/item/lot (A = outer row, B = lookup row)
    then Qty * (SELECT ITIACT
                FROM (Select Top 1 B.ITITDJ, B.ITIREF, B.ITIACT
                      From OMCXIT00 AS B
                      Where A.ITCO = B.ITCO
                        AND A.ITWH = B.ITWH
                        AND A.ITIM = B.ITIM
                        AND A.ITLOT = B.ITLOT
                        AND ((A.ITITDJ > B.ITITDJ)
                             OR (A.ITITDJ = B.ITITDJ AND A.ITIREF <= B.ITIREF))
                      ORDER BY B.ITITDJ DESC, B.ITIREF DESC) as C)
    else Transaction_Cost
END AS Transaction_Cost,
case when ITITCD = 'S' then ' Shipped - Stock' else null end as TypeofSale,
case when ititcd = 'S' then ITIREF else null end as OrderNumber
FROM
dbo.InvTransTable2 AS A
Here is the execution plan.
http://i.imgur.com/mP0Cu.png
Here is the DTA, but I am unsure how to read it since the recommendations are blank. Shouldn't that say "Create"?
http://i.imgur.com/4ycIP.png
You can't do much about dbo.InvTransTable2: since you are selecting all records from it, it will always be scanned.
Make sure that you have a clustered index on OMCXIT00; it looks like it is a heap with no clustered index.
Keep the clustered index key small, but choose columns with many distinct values.
If OMCXIT00 does not have many records, it may be sufficient to create an index with ITCO as the key and the other lookup columns (ITITDJ, ITIREF, ITWH, ITIM, ITLOT) in the INCLUDE list.
Index creation example:
CREATE INDEX IX_dbo_OMCXIT00
ON OMCXIT00 ([ITCO])
INCLUDE (ITITDJ, ITIREF);
If that does not help, then look at which of the columns in the search predicates have the most distinct values, and create an index keyed on one or some of those, putting the most selective columns first. The predicates are:
A.ITCO = B.ITCO
AND A.ITWH = B.ITWH
AND A.ITIM = B.ITIM
AND A.ITLOT = B.ITLOT
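For example, a composite index covering all four lookup columns plus the sort might look like this (a sketch; the key order and the DESC sort mirror the subquery, and whether it pays off depends on the data):
CREATE INDEX IX_dbo_OMCXIT00_cost_lookup
ON OMCXIT00 (ITCO, ITWH, ITIM, ITLOT, ITITDJ DESC, ITIREF DESC)
INCLUDE (ITIACT);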
Besides adding indexes to turn table scans into index seeks, ask yourself: "Do I really need this ORDER BY in this SQL code?" If you don't need the sorting, remove the ORDER BY from your SQL code; there is a good chance your code will be faster.
I ran across a problem with a SQL statement today that I was able to fix by adding additional criteria; however, I really want to know why my change fixed the problem.
The problem query:
SELECT *
FROM
(SELECT ah.*,
com.location,
ha.customer_number,
d.name applicance_NAME,
house.name house_NAME,
tr.name RULE_NAME
FROM actionhistory ah
INNER JOIN community com
ON (ah.city_id = com.city_id)
INNER JOIN house_address ha
ON (ah.applicance_id = ha.applicance_id
AND ha.status_cd = 'ACTIVE')
INNER JOIN applicance d
ON (ah.applicance_id = d.applicance_id)
INNER JOIN house house
ON (house.house_id = ah.house_id)
LEFT JOIN the_rule tr
ON (tr.the_rule_id = ah.the_rule_id)
WHERE actionhistory_id >= 'ACT100010000'
ORDER BY actionhistory_id
)
WHERE rownum <= 30000;
The "fix"
SELECT *
FROM
(SELECT ah.*,
com.location,
ha.customer_number,
d.name applicance_NAME,
house.name house_NAME,
tr.name RULE_NAME
FROM actionhistory ah
INNER JOIN community com
ON (ah.city_id = com.city_id)
INNER JOIN house_address ha
ON (ah.applicance_id = ha.applicance_id
AND ha.status_cd = 'ACTIVE')
INNER JOIN applicance d
ON (ah.applicance_id = d.applicance_id)
INNER JOIN house house
ON (house.house_id = ah.house_id)
LEFT JOIN the_rule tr
ON (tr.the_rule_id = ah.the_rule_id)
WHERE actionhistory_id >= 'ACT100010000' and actionhistory_id <= 'ACT100030000'
ORDER BY actionhistory_id
)
All of the _id columns are indexed sequences.
The first query's explain plan had a cost of 372 and the second was 14. This is running on an Oracle 11g database.
Additionally, if actionhistory_id in the where clause is anything less than ACT100000000, the original query returns instantly.
This is because of the index on the actionhistory_id column.
During the first query Oracle has to return all the index blocks containing indexes for records that come after 'ACT100010000', then it has to match the index to the table to get all the records, and then it pulls 29999 records from the result set.
During the second query Oracle only has to return the index blocks containing records between 'ACT100010000' and 'ACT100030000'. Then it grabs from the table those records that are represented in the index blocks. A lot less work in that step of grabbing the record after having found the index than if you use the first query.
Noticing your last line about if the id is less than ACT100000000 - sounds to me that those records may all be in the same memory block (or in a contiguous set of blocks).
EDIT: Please also consider what is said by Justin - I was talking about actual performance, but he is pointing out that the id being a varchar greatly increases the potential values (as opposed to a number), and that the estimated plan may reflect a greater time than reality because the optimizer doesn't know the full range until execution. To further optimize, taking his point into consideration, you could put a function-based index on the id column, or you could make it a combination key, with the varchar portion in one column and the numeric portion in another.
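A function-based index along those lines might look like this (a sketch, assuming the 'ACT' prefix is fixed so the numeric suffix always starts at character 4, and that every id parses cleanly as a number):
CREATE INDEX actionhistory_id_num_idx
ON actionhistory (TO_NUMBER(SUBSTR(actionhistory_id, 4)));
Queries then have to filter on that same expression, e.g. WHERE TO_NUMBER(SUBSTR(actionhistory_id, 4)) >= 100010000, for the index to be used.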
What are the plans for both queries?
Are the statistics on your tables up to date?
Do the two queries return the same set of rows? It's not obvious that they do but perhaps ACT100030000 is the largest actionhistory_id in the system. It's also a bit confusing because the first query has a predicate on actionhistory_id with a value of TRA100010000 which is very different than the ACT value in the second query. I'm guessing that is a typo?
Are you measuring the time required to fetch the first row? Or the time required to fetch the last row? What are those elapsed times?
My guess without that information is that the fact that you appear to be using the wrong data type for your actionhistory_id column is affecting the Oracle optimizer's ability to generate appropriate cardinality estimates which is likely causing the optimizer to underestimate the selectivity of your predicates and to generate poorly performing plans. A human may be able to guess that actionhistory_id is a string that starts with ACT10000 and then has 30,000 sequential numeric values from 00001 to 30000 but the optimizer is not that smart. It sees a 13 character string and isn't able to figure out that the last 10 characters are always going to be numbers so there are only 10 possible values rather than 256 (assuming 8-bit characters) and that the first 8 characters are always going to be the same constant value. If, on the other hand, actionhistory_id was defined as a NUMBER and had values between 1 and 30000, it would be dramatically easier for the optimizer to make reasonable estimates about the selectivity of various predicates.
I'm working with a non-profit that is mapping out solar potential in the US. Needless to say, we have a ridiculously large PostgreSQL 9 database. Running a query like the one shown below is speedy until the order by line is uncommented, in which case the same query takes forever to run (185 ms without sorting compared to 25 minutes with). What steps should be taken to ensure this and other queries run in a more manageable and reasonable amount of time?
select A.s_oid, A.s_id, A.area_acre, A.power_peak, A.nearby_city, A.solar_total
from global_site A cross join na_utility_line B
where (A.power_peak between 1.0 AND 100.0)
and A.area_acre >= 500
and A.solar_avg >= 5.0
AND A.pc_num <= 1000
and (A.fips_level1 = '06' AND A.fips_country = 'US' AND A.fips_level2 = '025')
and B.volt_mn_kv >= 69
and B.fips_code like '%US06%'
and B.status = 'active'
and ST_within(ST_Centroid(A.wkb_geometry), ST_Buffer((B.wkb_geometry), 1000))
--order by A.area_acre
offset 0 limit 11;
The sort is not the problem - in fact the CPU and memory cost of the sort is close to zero, since Postgres has a Top-N sort: the result set is scanned while a small sort buffer holding only the Top-N rows is kept up to date.
select count(*) from (1 million row table) -- 0.17 s
select * from (1 million row table) order by x limit 10; -- 0.18 s
select * from (1 million row table) order by x; -- 1.80 s
So you see the Top-10 sorting only adds 10 ms to a dumb fast count(*) versus a lot longer for a real sort. That's a very neat feature, I use it a lot.
OK, now without EXPLAIN ANALYZE it's impossible to be sure, but my feeling is that the real problem is the cross join. Basically you're filtering the rows in both tables using:
where (A.power_peak between 1.0 AND 100.0)
and A.area_acre >= 500
and A.solar_avg >= 5.0
AND A.pc_num <= 1000
and (A.fips_level1 = '06' AND A.fips_country = 'US' AND A.fips_level2 = '025')
and B.volt_mn_kv >= 69
and B.fips_code like '%US06%'
and B.status = 'active'
OK. I don't know how many rows are selected in both tables (only EXPLAIN ANALYZE would tell), but it's probably significant. Knowing those numbers would help.
Then we get the worst-case CROSS JOIN condition ever:
and ST_within(ST_Centroid(A.wkb_geometry), ST_Buffer((B.wkb_geometry), 1000))
This means all rows of A are matched against all rows of B (so this expression is going to be evaluated a large number of times), using a bunch of pretty complex, slow, and CPU-intensive functions.
Of course it's horribly slow!
When you remove the ORDER BY, Postgres just comes up (by chance?) with a bunch of matching rows right at the start, outputs those, and stops once the LIMIT is reached.
Here's a little example:
Tables a and b are identical, each containing 1000 rows and a column of type BOX.
select * from a cross join b where (a.b && b.b) --- 0.28 s
Here 1000000 box overlap (operator &&) tests are completed in 0.28s. The test data set is generated so that the result set contains only 1000 rows.
create index a_b on a using gist(b);
create index b_b on b using gist(b);
select * from a cross join b where (a.b && b.b) --- 0.01 s
Here the index is used to optimize the cross join, and speed is ridiculous.
You need to optimize that geometry matching.
add columns which will cache:
ST_Centroid(A.wkb_geometry)
ST_Buffer((B.wkb_geometry), 1000)
There is NO POINT in recomputing those slow functions a million times during your CROSS JOIN, so store the results in a column. Use a trigger to keep them up to date.
add columns of type BOX which will cache:
Bounding Box of ST_Centroid(A.wkb_geometry)
Bounding Box of ST_Buffer((B.wkb_geometry), 1000)
add gist indexes on the BOXes
add a Box overlap test (using the && operator) which will use the index
keep your ST_Within which will act as a final filter on the rows that pass
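Putting those steps together, a sketch of what it could look like (the cached-column and index names are made up):
-- one-off backfill of cached geometry columns (names are assumptions)
ALTER TABLE global_site ADD COLUMN centroid geometry;
ALTER TABLE na_utility_line ADD COLUMN buffered geometry;
UPDATE global_site SET centroid = ST_Centroid(wkb_geometry);
UPDATE na_utility_line SET buffered = ST_Buffer(wkb_geometry, 1000);
-- GiST indexes so the && bounding-box overlap test is indexed
CREATE INDEX global_site_centroid_gist ON global_site USING gist (centroid);
CREATE INDEX na_utility_line_buffered_gist ON na_utility_line USING gist (buffered);
-- trigger to keep one cached column current (same idea for the other table)
CREATE OR REPLACE FUNCTION global_site_cache_centroid() RETURNS trigger AS $$
BEGIN
    NEW.centroid := ST_Centroid(NEW.wkb_geometry);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER global_site_cache_centroid_trg
BEFORE INSERT OR UPDATE ON global_site
FOR EACH ROW EXECUTE PROCEDURE global_site_cache_centroid();
The join condition then becomes A.centroid && B.buffered AND ST_Within(A.centroid, B.buffered), so the indexed overlap test prunes candidates before the exact test runs.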
Maybe you can just index the ST_Centroid and ST_Buffer columns... and use an (indexed) "contains" operator, see here :
http://www.postgresql.org/docs/8.2/static/functions-geometry.html
I would suggest creating an index on area_acre. You may want to take a look at the following: http://www.postgresql.org/docs/9.0/static/sql-createindex.html
I would recommend doing this sort of thing off peak hours, though, because it can be somewhat intensive with a large amount of data. One thing you will also have to look at with indexes is rebuilding them on a schedule to maintain performance over time. Again, this schedule should be outside of peak hours.
You may want to take a look at this article from a fellow SO'er and his experience with database slowdowns over time with indexes: Why does PostgresQL query performance drop over time, but restored when rebuilding index
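For what it's worth, in PostgreSQL such a scheduled rebuild is a one-liner (using the index name suggested in the answer further down):
REINDEX INDEX area_acre;  -- run during off-peak hours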
If the A.area_acre field is not indexed that may slow it down. You can run the query with EXPLAIN to see what it is doing during execution.
First off I would look at creating indexes, ensure your db is being vacuumed, and increase the shared_buffers and work_mem settings for your install.
First thing to look at is whether you have an index on the field you're ordering by. If not, adding one will dramatically improve performance. I don't know postgresql that well but something similar to:
CREATE INDEX area_acre ON global_site(area_acre)
As noted in other replies, the indexing process is intensive when working with a large data set, so do this during off-peak.
I am not familiar with the PostgreSQL optimizations, but it sounds like what is happening when the query is run with the ORDER BY clause is that the entire result set is created, then it is sorted, and then the top 11 rows are taken from that sorted result. Without the ORDER BY, the query engine can just generate the first 11 rows in whatever order it pleases and then it's done.
Having an index on the area_acre field very possibly may not help for the sorting (ORDER BY) depending on how the result set is built. It could, in theory, be used to generate the result set by traversing the global_site table using an index on area_acre; in that case, the results would be generated in the desired order (and it could stop after generating 11 rows in the result). If it does not generate the results in that order (and it seems like it may not be), then that index will not help in sorting the results.
One thing you might try is to remove the "CROSS JOIN" from the query. I doubt that this will make a difference, but it's worth a test. Because a WHERE clause is involved joining the two tables (via ST_WITHIN), I believe the result is the same as an inner join. It is possible that the use of the CROSS JOIN syntax is causing the optimizer to make an undesirable choice.
Otherwise (aside from making sure indexes exist for fields that are being filtered), you could play a bit of a guessing game with the query. One condition that stands out is the area_acre >= 500. This means that the query engine is considering all rows that meet that condition. But then only the first 11 rows are taken. You could try changing it to area_acre >= 500 and area_acre <= somevalue. The somevalue is the guessing part that would need adjustment to make sure you get at least 11 rows. This, however, seems like a pretty cheesy thing to do, so I mention it with some reticence.
Have you considered creating expression-based indexes for the benefit of the hairier joins and where conditions?
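For instance (a sketch, assuming your PostGIS version marks ST_Centroid immutable so it is allowed in an index expression):
CREATE INDEX global_site_centroid_idx
ON global_site USING gist (ST_Centroid(wkb_geometry));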
[Warning: long post ahead!]
I've been banging my head against this for quite some time now but can't pin down what is going on. I've found a workaround (see the end), but my inner Zen is not satisfied yet.
I have a main table with forum messages (it's from Phorum); simplified, it looks like this (ignore the anon_user_id for the moment, I will get to it later):
CREATE TABLE `test_msg` (
`message_id` int(10) unsigned NOT NULL auto_increment,
`status` tinyint(4) NOT NULL default '2',
`user_id` int(10) unsigned NOT NULL default '0',
`datestamp` int(10) unsigned NOT NULL default '0',
`anon_user_id` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`message_id`)
);
Messages can be anonymized by the software, in which case the user_id is set to 0. The software also allows posting completely anonymous messages, which we endorse. In our case we still need to know which user posted a message, so through the hook system provided by Phorum we maintain a second table accordingly:
CREATE TABLE `test_anon` (
`message_id` bigint(20) unsigned NOT NULL,
`user_id` bigint(20) unsigned NOT NULL,
KEY `fk_user_id` (`user_id`),
KEY `fk_message_id` (`message_id`)
);
For the profile view, I need to get a list of messages from a user, no matter whether they have been anonymized or not.
A user always has the right to see the messages he wrote anonymously or later anonymized.
Because user_id gets set to 0 when a message is anonymized, we can't simply filter on it with WHERE; we need to join against our second table. Formulated as SQL, the above looks like this (the status = 2 is required; other states mean the post is hidden, pending approval, etc.):
SELECT * FROM test_msg AS m
LEFT JOIN test_anon ON test_anon.message_id = m.message_id
WHERE (test_anon.user_id = 20 OR m.user_id = 20)
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 0,10
This query by itself, whenever the query cache is empty, takes a few seconds, currently around 4 seconds. Things get worse when multiple users issue the query while the query cache is empty (which just happens; people post messages and the cached queries are invalidated); we saw this during our internal testing phase, with reports that the system sometimes slows down. We've seen queries taking 30 to 60 seconds because of the concurrency. I don't want to start imagining what happens when we expand our user base ...
Now, it's not like I didn't do any analysis of the bottleneck.
I tried rewriting the WHERE clause, adding indexes and deleting them like hell.
This is when I found out that, under certain conditions, the query performs lightning fast when I do not use any index. Using no index, the query looks like:
SELECT * FROM test_msg AS m USE INDEX()
LEFT JOIN test_anon ON test_anon.message_id = m.message_id
WHERE (test_anon.user_id = 20 OR m.user_id = 20)
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 0,10
Now here comes the certain condition: the LIMIT limits the result to 10 rows. Assume my complete result is n = 26. Anything from LIMIT 0,10 to LIMIT 16,10 takes zero seconds (something along the lines of < 0.01 s): these are the cases where the result is always 10 full rows.
Starting with LIMIT 17,10, the result will only be 9 rows. From that point on, the query starts taking around four seconds again. The same applies to every case where the result set is smaller than the maximum number of rows allowed by the LIMIT. Irritating!
Going back to the first CREATE TABLE statement, I also conducted tests without the LEFT JOIN: we just set user_id = 0 and anon_user_id = <the previous user_id> for anonymized messages, in other words completely bypassing the second table:
SELECT * FROM test_msg
WHERE status = 2 AND (user_id = 20 OR anon_user_id = 20)
ORDER BY datestamp DESC
LIMIT 20,10
Result: it did not matter. The performance is still 4 or 5 seconds; forcing it not to use an index with USE INDEX () does not speed up this query.
This is where I really got puzzled. The index will only ever be used for the status column; the OR prevents the other indexes from being used, which is also what the MySQL documentation told me in this regard.
An alternative solution I tried: have the test_anon table relate not only to anonymized messages, but simply to all messages. This allows me to write a query like this:
SELECT * FROM test_msg AS m, test_anon AS t
WHERE m.message_id = t.message_id
AND t.user_id = 20
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 20,10
This query always gave me instant results (i.e. < 0.01 seconds), no matter what the LIMIT was, etc.
Yes, I've found a solution. I've not yet rewritten the whole application to this model, though.
But I'd like to better understand the rationale behind the behavior I observed (especially why forcing no index speeds up the query). On paper, nothing looked wrong with the original approach.
Some numbers (they aren't that big anyway):
~ one million messages
message table data size is around 600MB
message table index size is around 350MB
number of anonymized messages in test_anon < 3% of all messages
number of messages from registered users < 25% of all messages
All tables are MyISAM; I tried InnoDB, but performance was much worse.
You in fact have two different queries here which are better processed as separate queries.
To improve the LIMIT, you need to use the LIMIT on LIMIT technique:
SELECT *
FROM (
SELECT *
FROM test_msg AS m
WHERE m.user_id = 20
AND m.status = 2
ORDER BY
m.datestamp DESC
LIMIT 20
) q1
UNION ALL
SELECT *
FROM
(
SELECT m.*
FROM test_msg m
JOIN test_anon a
ON a.message_id = m.message_id
WHERE a.user_id = 20
AND m.user_id = 0
AND m.status = 2
ORDER BY
m.datestamp DESC
LIMIT 20
) q2
ORDER BY
datestamp DESC
LIMIT 20
See this entry in my blog for more detail on this solution:
MySQL: LIMIT on LIMIT
You need to create two composite indexes for this to work fast:
test_msg (status, user_id, datestamp)
test_msg (status, user_id, message_id, datestamp)
Then you need to choose what the index will be used for in the second query: ordering or filtering.
In your query, the index cannot be used for both, since you're filtering on a range on message_id.
See this article for more explanations:
Choosing index
In a couple of words:
If there are lots of anonymous messages from this user, i.e. there is a high probability that the message will be found somewhere near the beginning of the index, then the index should be used for sorting. Use the first index.
If there are few anonymous messages from this user, i.e. there is a low probability that the message will be found somewhere near the beginning of the index, then the index should be used for filtering. Use the second index.
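If the optimizer picks the wrong one, it can be steered with an index hint (a sketch; the index name is hypothetical, standing in for whichever of the two composite indexes you created):
SELECT m.*
FROM test_msg m FORCE INDEX (ix_status_user_datestamp)  -- hypothetical name
JOIN test_anon a ON a.message_id = m.message_id
WHERE a.user_id = 20
  AND m.user_id = 0
  AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 20;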
If there is a possibility to redesign the tables, just add another column is_anonymous to the table test_msg.
It will solve lots of problems.
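A minimal sketch of that redesign (assuming, as in the question's own experiment, that anon_user_id always carries the real author):
ALTER TABLE test_msg
    ADD COLUMN is_anonymous TINYINT(1) NOT NULL DEFAULT 0,
    ADD INDEX ix_author_status_date (anon_user_id, status, datestamp);
-- the profile query then needs neither the OR nor the join:
SELECT *
FROM test_msg
WHERE anon_user_id = 20
  AND status = 2
ORDER BY datestamp DESC
LIMIT 0, 10;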
The problem is that you're doing a join for the entire table. You need to tell the optimizer that you only need to join for two user IDs: zero and your desired user ID. Like this:
SELECT * FROM test_msg AS m
LEFT JOIN test_anon ON test_anon.message_id = m.message_id
WHERE (m.user_id = 20 OR m.user_id = 0)
AND (test_anon.user_id = 20 OR test_anon.user_id IS NULL)
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 0,10
Does this work better?