Here's the query:
with contrib as (
select
first_name,
last_name,
user_id,
photo_url
from contributors
where visible = true
group by 1,2,3,4
),
dwm as (
select * from dialogues_with_metadata
),
joined as (
select
c.*,
dwm.dialogue_id
from contrib c
left join dwm on c.user_id = dwm.contributor_one_user_id or c.user_id = dwm.contributor_two_user_id
)
select
first_name,
last_name,
user_id,
photo_url,
count(distinct dialogue_id) as dialogues
from joined
group by 1,2,3,4
order by 3 desc
This is a PostgreSQL database.
CPU usage is 10%, so I don't think that's the problem!
I suspect what's slowing things down is the join. How might I reconfigure this query so that it doesn't take ~20 seconds to return fewer than 300 rows?
Here's the contributors table schema:
user_id (primary key - uuid)
username
first_name
last_name
hash
description
blurb
photo_url
blurb_updated_at
visible
And the dialogues table schema:
dialogue_id (uuid)
contributor_one_uuid
contributor_two_uuid
title
image_url
visible
categories
image_source
current_popularity
created_at
override_url
I ran explain select * from dialogues_with_metadata; here are the results:
Hash Left Join (cost=167.80..172.34 rows=137 width=880)
Hash Cond: (a.dialogue_id = b.writing_dialogue_id)
CTE main
-> Hash Right Join (cost=64.30..111.60 rows=137 width=504)
Hash Cond: (c2.user_id = d.contributor_two_uuid)
-> Seq Scan on contributors c2 (cost=0.00..43.60 rows=260 width=125)
-> Hash (cost=62.59..62.59 rows=137 width=325)
-> Hash Left Join (cost=46.85..62.59 rows=137 width=325)
Hash Cond: (d.contributor_one_uuid = c1.user_id)
-> Seq Scan on dialogues d (cost=0.00..15.37 rows=137 width=216)
-> Hash (cost=43.60..43.60 rows=260 width=125)
-> Seq Scan on contributors c1 (cost=0.00..43.60 rows=260 width=125)
CTE dialogues_with_installment_counts
-> HashAggregate (cost=50.39..52.00 rows=129 width=28)
Group Key: writings.dialogue_id
-> Seq Scan on writings (cost=0.00..46.82 rows=476 width=28)
Filter: finalized
-> CTE Scan on main a (cost=0.00..2.74 rows=137 width=868)
-> Hash (cost=2.58..2.58 rows=129 width=28)
-> CTE Scan on dialogues_with_installment_counts b (cost=0.00..2.58 rows=129 width=28)
EXPLAIN on the updated query:
Seq Scan on contributors c (cost=0.00..4012.89 rows=247 width=109)
Filter: visible
SubPlan 1
-> Aggregate (cost=16.06..16.07 rows=1 width=8)
-> Seq Scan on dialogues (cost=0.00..16.05 rows=2 width=0)
Filter: ((c.user_id = contributor_one_uuid) OR (c.user_id = contributor_two_uuid))
explain (analyze, buffers) on the updated query:
Seq Scan on contributors c (cost=0.00..4205.86 rows=259 width=109) (actual time=0.073..16819.258 rows=260 loops=1)
Filter: visible
Rows Removed by Filter: 13
Buffers: shared hit=3681
SubPlan 1
-> Aggregate (cost=16.06..16.07 rows=1 width=8) (actual time=64.548..64.549 rows=1 loops=260)
Buffers: shared hit=3640
-> Seq Scan on dialogues (cost=0.00..16.05 rows=2 width=0) (actual time=49.155..64.547 rows=1 loops=260)
Filter: ((c.user_id = contributor_one_uuid) OR (c.user_id = contributor_two_uuid))
Rows Removed by Filter: 136
Buffers: shared hit=3640
Planning Time: 0.136 ms
Execution Time: 16819.365 ms
A second time (16 seconds faster!):
Seq Scan on contributors c (cost=0.00..4205.86 rows=259 width=109) (actual time=0.063..801.278 rows=260 loops=1)
Filter: visible
Rows Removed by Filter: 13
Buffers: shared hit=3681
SubPlan 1
-> Aggregate (cost=16.06..16.07 rows=1 width=8) (actual time=3.080..3.080 rows=1 loops=260)
Buffers: shared hit=3640
-> Seq Scan on dialogues (cost=0.00..16.05 rows=2 width=0) (actual time=0.009..3.079 rows=1 loops=260)
Filter: ((c.user_id = contributor_one_uuid) OR (c.user_id = contributor_two_uuid))
Rows Removed by Filter: 136
Buffers: shared hit=3640
Planning Time: 0.127 ms
Execution Time: 801.379 ms
This query should do the same without the unnecessary grouping statements:
select
c.first_name,
c.last_name,
c.user_id,
c.photo_url,
(
select
count(distinct dialogue_id) from dialogues_with_metadata dwm
where
c.user_id = dwm.contributor_one_user_id
or c.user_id = dwm.contributor_two_user_id
) as dialogues
from contributors c
where c.visible = true
Additionally, it's worth checking whether counting the number of dialogues can be done in a more efficient way (what indexes are on this table?).
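To answer the index question, the indexes that already exist on the table can be listed from PostgreSQL's pg_indexes catalog view, for example:
-- list the indexes currently defined on the dialogues table
select indexname, indexdef
from pg_indexes
where tablename = 'dialogues';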
Query version using the dialogues table directly:
select
c.first_name,
c.last_name,
c.user_id,
c.photo_url,
(
select
count(*) from dialogues
where
c.user_id = dialogues.contributor_one_uuid
or c.user_id = dialogues.contributor_two_uuid
) as dialogues
from contributors c
where c.visible = true
Another version:
select
c.first_name,
c.last_name,
c.user_id,
c.photo_url,
s.dialogues
from contributors c
join (
select count(*) dialogues, user_id from (
select contributor_one_uuid user_id from dialogues
union all
select contributor_two_uuid from dialogues
) stats group by user_id
) s on (s.user_id = c.user_id)
where c.visible = true
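One caveat worth noting about this join-based version: an inner join drops contributors who have no dialogues at all, whereas the original query kept them with a count of 0. If those rows matter, a left join with coalesce preserves them; a sketch, reusing the same aliases:
select
c.first_name,
c.last_name,
c.user_id,
c.photo_url,
coalesce(s.dialogues, 0) as dialogues
from contributors c
left join (
select count(*) dialogues, user_id from (
select contributor_one_uuid user_id from dialogues
union all
select contributor_two_uuid from dialogues
) stats group by user_id
) s on (s.user_id = c.user_id)
where c.visible = true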
Related
What makes performance vary for the same query? My DB has just ~10 tables and no more than a few thousand rows.
Here's the query:
select
c.first_name,
c.last_name,
c.user_id,
c.photo_url,
s.dialogues
from contributors c
join (
select count(*) dialogues, user_id from (
select contributor_one_uuid user_id from dialogues
union all
select contributor_two_uuid from dialogues
) stats group by user_id
) s on (s.user_id = c.user_id)
where c.visible = true
#1: It takes almost 17 seconds to execute! explain (analyze, buffers) on the query:
Seq Scan on contributors c (cost=0.00..4205.86 rows=259 width=109) (actual time=0.073..16819.258 rows=260 loops=1)
Filter: visible
Rows Removed by Filter: 13
Buffers: shared hit=3681
SubPlan 1
-> Aggregate (cost=16.06..16.07 rows=1 width=8) (actual time=64.548..64.549 rows=1 loops=260)
Buffers: shared hit=3640
-> Seq Scan on dialogues (cost=0.00..16.05 rows=2 width=0) (actual time=49.155..64.547 rows=1 loops=260)
Filter: ((c.user_id = contributor_one_uuid) OR (c.user_id = contributor_two_uuid))
Rows Removed by Filter: 136
Buffers: shared hit=3640
Planning Time: 0.136 ms
Execution Time: 16819.365 ms
#2: It takes less than a second to execute!
Seq Scan on contributors c (cost=0.00..4205.86 rows=259 width=109) (actual time=0.063..801.278 rows=260 loops=1)
Filter: visible
Rows Removed by Filter: 13
Buffers: shared hit=3681
SubPlan 1
-> Aggregate (cost=16.06..16.07 rows=1 width=8) (actual time=3.080..3.080 rows=1 loops=260)
Buffers: shared hit=3640
-> Seq Scan on dialogues (cost=0.00..16.05 rows=2 width=0) (actual time=0.009..3.079 rows=1 loops=260)
Filter: ((c.user_id = contributor_one_uuid) OR (c.user_id = contributor_two_uuid))
Rows Removed by Filter: 136
Buffers: shared hit=3640
Planning Time: 0.127 ms
Execution Time: 801.379 ms
The engine is performing a SeqScan on contributors and I think that's unavoidable (unless I'm quite mistaken). However, it's also performing a SeqScan on dialogues and this can be prevented with a lateral join.
If the dialogues table has indexes on contributor_one_uuid and on contributor_two_uuid, the query can be rephrased. Hopefully this change can speed it up:
select
c.first_name,
c.last_name,
c.user_id,
c.photo_url,
s.dialogues
from contributors c,
lateral (
select (select count(*) from dialogues d where d.contributor_one_uuid = c.user_id)
+ (select count(*) from dialogues d where d.contributor_two_uuid = c.user_id)
as dialogues
) s
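If those indexes don't exist yet, they could be created along these lines (the index names are just illustrative):
create index if not exists dialogues_contributor_one_uuid_idx on dialogues (contributor_one_uuid);
create index if not exists dialogues_contributor_two_uuid_idx on dialogues (contributor_two_uuid);
With them in place, each of the two inner subqueries can be answered with an index scan instead of a sequential scan over dialogues.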
I need to JOIN a table inside a correlated subquery. However, the query plan Postgres chooses is very slow. How can I optimize the following query?
SELECT c.id
FROM customer c
WHERE EXISTS (
SELECT 1
FROM customer_communication cc
JOIN communication co on co.id = cc.communication_id and co.channel <> 'mobile'
WHERE cc.user_id = c.id
)
This is the EXPLAIN (ANALYZE) result:
Nested Loop (cost=3451561.57..3539012.42 rows=24509 width=8) (actual time=60913.294..64056.970 rows=1036309 loops=1)
-> HashAggregate (cost=3451561.14..3451806.23 rows=24509 width=8) (actual time=60913.264..61187.702 rows=1036310 loops=1)
Group Key: cc.customer_id
-> Hash Join (cost=2070834.75..3358538.60 rows=37209016 width=8) (actual time=32758.325..52752.383 rows=37209019 loops=1)
Hash Cond: (cc.communication_id = co.id)
-> Seq Scan on customer_communication cc (cost=0.00..755689.16 rows=37209016 width=16) (actual time=0.011..4949.315 rows=37209019 loops=1)
-> Hash (cost=1772758.38..1772758.38 rows=18168430 width=8) (actual time=32756.662..32756.663 rows=18108924 loops=1)
Buckets: 262144 Batches: 128 Memory Usage: 7557kB
-> Seq Scan on communication co (cost=0.00..1772758.38 rows=18168430 width=8) (actual time=0.007..30024.494 rows=18108924 loops=1)
Filter: (channel <> 'mobile')
-> Index Only Scan using customerxpk on customer c (cost=0.43..3.60 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1036310)
Index Cond: (id = cc.customer_id)
Heap Fetches: 525050
Planning Time: 0.391 ms
Execution Time: 64094.584 ms
I think you have mis-specified the query, because you have conflicting aliases. This might be better:
SELECT c.id
FROM customer c
WHERE EXISTS (SELECT 1
FROM customer_communication cc JOIN
communication co
ON co.id = cc.communication_id AND
co.channel <> 'mobile'
WHERE cc.user_id = c.id
);
Note that in the subquery c refers to the outer query's customer and co refers to communication.
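If the join column really is customer_communication.customer_id, as the plan suggests, another formulation worth trying is to nest the EXISTS so the planner can drive the lookup from customer instead of hashing all of communication. This is only a sketch and assumes indexes on customer_communication(customer_id) and communication(id):
SELECT c.id
FROM customer c
WHERE EXISTS (SELECT 1
              FROM customer_communication cc
              WHERE cc.customer_id = c.id
                AND EXISTS (SELECT 1
                            FROM communication co
                            WHERE co.id = cc.communication_id
                              AND co.channel <> 'mobile')
             );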
We run a join query between two tables.
The query has an OR condition that compares one column from the left table with one column from the right table. The query performance is very poor, and we fixed it by changing the OR to a UNION.
Why is this happening? I'm looking for a detailed explanation or a reference to the documentation that might shed light on the issue.
Query with OR statement:
db1=# explain analyze select count(*)
from conversations
join agents on conversations.agent_id=agents.id
where conversations.id=1 or agents.id = '123';
Query plan:
----------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=11017.95..11017.96 rows=1 width=8) (actual time=54.088..54.088 rows=1 loops=1)
-> Gather (cost=11017.73..11017.94 rows=2 width=8) (actual time=53.945..57.181 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=10017.73..10017.74 rows=1 width=8) (actual time=48.303..48.303 rows=1 loops=3)
-> Hash Join (cost=219.26..10016.69 rows=415 width=0) (actual time=5.292..48.287 rows=130 loops=3)
Hash Cond: (conversations.agent_id = agents.id)
Join Filter: ((conversations.id = 1) OR ((agents.id)::text = '123'::text))
Rows Removed by Join Filter: 80035
-> Parallel Seq Scan on conversations (cost=0.00..9366.95 rows=163995 width=8) (actual time=0.017..14.972 rows=131196 loops=3)
-> Hash (cost=143.56..143.56 rows=6056 width=16) (actual time=2.686..2.686 rows=6057 loops=3)
Buckets: 8192 Batches: 1 Memory Usage: 353kB
-> Seq Scan on agents (cost=0.00..143.56 rows=6056 width=16) (actual time=0.011..1.305 rows=6057 loops=3)
Planning time: 0.710 ms
Execution time: 57.276 ms
(15 rows)
Changing the OR to UNION:
db1=# explain analyze select count(*) from (
select *
from conversations
join agents on conversations.agent_id=agents.id
where conversations.installation_id=1
union
select *
from conversations
join agents on conversations.agent_id=agents.id
where agents.source_id = '123') as subquery;
Query plan:
----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1114.31..1114.32 rows=1 width=8) (actual time=8.038..8.038 rows=1 loops=1)
-> HashAggregate (cost=1091.90..1101.86 rows=996 width=1437) (actual time=7.783..8.009 rows=390 loops=1)
Group Key: conversations.id, conversations.created, conversations.modified, conversations.source_created, conversations.source_id, conversations.installation_id, brain_conversation.resolution_reason, conversations.solve_time, conversations.agent_id, conversations.submission_reason, conversations.is_marked_as_duplicate, conversations.num_back_and_forths, conversations.is_closed, conversations.is_solved, conversations.conversation_type, conversations.related_ticket_source_id, conversations.channel, brain_conversation.last_updated_from_platform, conversations.csat, agents.id, agents.created, agents.modified, agents.name, agents.source_id, organization_agent.installation_id, agents.settings
-> Append (cost=219.68..1027.16 rows=996 width=1437) (actual time=5.517..6.307 rows=390 loops=1)
-> Hash Join (cost=219.68..649.69 rows=931 width=224) (actual time=5.516..6.063 rows=390 loops=1)
Hash Cond: (conversations.agent_id = agents.id)
-> Index Scan using conversations_installation_id_b3ff5c00 on conversations (cost=0.42..427.98 rows=931 width=154) (actual time=0.039..0.344 rows=879 loops=1)
Index Cond: (installation_id = 1)
-> Hash (cost=143.56..143.56 rows=6056 width=70) (actual time=5.394..5.394 rows=6057 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 710kB
-> Seq Scan on agents (cost=0.00..143.56 rows=6056 width=70) (actual time=0.014..1.938 rows=6057 loops=1)
-> Nested Loop (cost=0.70..367.52 rows=65 width=224) (actual time=0.210..0.211 rows=0 loops=1)
-> Index Scan using agents_source_id_106c8103_like on agents agents_1 (cost=0.28..8.30 rows=1 width=70) (actual time=0.210..0.210 rows=0 loops=1)
Index Cond: ((source_id)::text = '123'::text)
-> Index Scan using conversations_agent_id_de76554b on conversations conversations_1 (cost=0.42..358.12 rows=110 width=154) (never executed)
Index Cond: (agent_id = agents_1.id)
Planning time: 2.024 ms
Execution time: 9.367 ms
(18 rows)
Yes. OR has a way of killing the performance of queries. For this query:
select count(*)
from conversations c join
agents a
on c.agent_id = a.id
where c.id = 1 or a.id = 123;
Note I removed the quotes around 123. It looks like a number so I assume it is. For this query, you want an index on conversations(agent_id).
Probably the most effective way to write the query is:
select count(*)
from ((select 1
from conversations c join
agents a
on c.agent_id = a.id
where c.id = 1
) union all
(select 1
from conversations c join
agents a
on c.agent_id = a.id
where a.id = 123 and c.id <> 1
)
) ac;
Note the use of union all rather than union. The additional where condition (c.id <> 1) in the second branch keeps rows that match both branches from being counted twice.
This can take advantage of the following indexes:
conversations(id, agent_id)
agents(id)
conversations(agent_id, id)
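If they are missing, those indexes could be created roughly like this (the names are illustrative, and agents(id) is very likely already covered by the primary key):
create index if not exists conversations_id_agent_id_idx on conversations (id, agent_id);
create index if not exists conversations_agent_id_id_idx on conversations (agent_id, id);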
I have the following query which takes a little too long to execute. I have posted the EXPLAIN ANALYZE for the query. Anything I can do to improve its speed?
EXPLAIN ANALYZE
SELECT c.*, match.user_json
FROM match
INNER JOIN conversation c ON match.match_id = c.match_id
WHERE c.from_id <> 142822281
  AND c.to_id = 142822281
  AND c.unix_timestamp = (SELECT max(unix_timestamp)
                          FROM conversation
                          WHERE match_id = c.match_id
                          GROUP BY match_id)
EXPLAIN ANALYZE results
Nested Loop (cost=0.00..16183710.79 rows=2 width=805) (actual time=2455.133..2502.781 rows=34 loops=1)
Join Filter: (match.match_id = c.match_id)
Rows Removed by Join Filter: 71502
-> Seq Scan on match (cost=0.00..268.51 rows=2151 width=723) (actual time=0.006..4.973 rows=2104 loops=1)
-> Materialize (cost=0.00..16183377.75 rows=2 width=90) (actual time=0.034..1.168 rows=34 loops=2104)
-> Seq Scan on conversation c (cost=0.00..16183377.74 rows=2 width=90) (actual time=70.972..2421.949 rows=34 loops=1)
Filter: ((from_id <> 142822281) AND (to_id = 142822281) AND (unix_timestamp = (SubPlan 1)))
Rows Removed by Filter: 22010
SubPlan 1
-> GroupAggregate (cost=0.00..739.64 rows=10 width=16) (actual time=5.358..5.358 rows=1 loops=450)
Group Key: conversation.match_id
-> Seq Scan on conversation (cost=0.00..739.49 rows=10 width=16) (actual time=3.355..5.320 rows=17 loops=450)
Filter: (match_id = c.match_id)
Rows Removed by Filter: 22027
Planning Time: 1.132 ms
Execution Time: 2502.926 ms
This is your query:
SELECT c.*, m.user_json
FROM match m INNER JOIN
conversation c
ON m.match_id = c.match_id
WHERE c.from_id <> 142822281 AND
c.to_id = 142822281 AND
c.unix_timestamp = (SELECT max( c2.unix_timestamp )
FROM conversation c2
WHERE c2.match_id = c.match_id
GROUP BY c2.match_id
);
I would suggest writing it as:
SELECT DISTINCT ON (c.match_id) c.*, m.user_json
FROM match m INNER JOIN
conversation c
ON m.match_id = c.match_id
WHERE c.from_id <> 142822281 AND
      c.to_id = 142822281
ORDER BY c.match_id, c.unix_timestamp DESC;
Then try an index on: conversation(to_id, from_id, match_id). I assume you have an index on match(match_id).
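Expressed as DDL, that suggestion would look roughly like this (the index name is illustrative):
CREATE INDEX IF NOT EXISTS conversation_to_from_match_idx ON conversation (to_id, from_id, match_id);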
I've written the following PostgreSQL query which works as it should. However, it seems to be awfully slow, sometimes taking up to 10 seconds to return a result. I'm sure there is something in my statement that is causing this to be slow.
Can anyone help determine why this query is slow?
SELECT DISTINCT ON (school_classes.class_id,attendance_calendar.school_date)
school_classes.class_id, school_classes.class_name, school_classes.grade_id
, school_gradelevels.linked_calendar, attendance_calendars.calendar_id
, attendance_calendar.school_date, attendance_calendar.minutes
, teacher_join_classes_subjects.staff_id, staff.first_name, staff.last_name
FROM school_classes
INNER JOIN school_gradelevels ON school_gradelevels.id=school_classes.grade_id
INNER JOIN teacher_join_classes_subjects ON teacher_join_classes_subjects.class_id=school_classes.class_id
INNER JOIN staff ON staff.staff_id=teacher_join_classes_subjects.staff_id
INNER JOIN attendance_calendars ON attendance_calendars.title=school_gradelevels.linked_calendar
INNER JOIN attendance_calendar ON attendance_calendar.calendar_id=attendance_calendars.calendar_id
WHERE teacher_join_classes_subjects.syear='2013'
AND staff.syear='2013'
AND attendance_calendars.syear='2013'
AND teacher_join_classes_subjects.does_attendance='Y'
AND teacher_join_classes_subjects.subject_id IS NULL
AND attendance_calendar.school_date<CURRENT_DATE
AND attendance_calendar.school_date NOT IN (
SELECT com.school_date FROM attendance_completed com
WHERE com.class_id=school_classes.class_id
AND (com.period_id='101' AND attendance_calendar.minutes>='151' OR
com.period_id='95' AND attendance_calendar.minutes='150') )
I replaced the NOT IN with the following:
AND NOT EXISTS (
SELECT com.school_date
FROM attendance_completed com
WHERE com.class_id=school_classes.class_id
AND com.school_date=attendance_calendar.school_date
AND (com.period_id='101' AND attendance_calendar.minutes>='151' OR
com.period_id='95' AND attendance_calendar.minutes='150') )
Result of EXPLAIN ANALYZE:
Unique (cost=2998.39..2998.41 rows=3 width=85) (actual time=10751.111..10751.118 rows=1 loops=1)
-> Sort (cost=2998.39..2998.40 rows=3 width=85) (actual time=10751.110..10751.110 rows=2 loops=1)
Sort Key: school_classes.class_id, attendance_calendar.school_date
Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=2.03..2998.37 rows=3 width=85) (actual time=6409.471..10751.045 rows=2 loops=1)
Hash Cond: ((teacher_join_classes_subjects.class_id = school_classes.class_id) AND (school_gradelevels.id = school_classes.grade_id))
Join Filter: (NOT (SubPlan 1))
-> Nested Loop (cost=0.00..120.69 rows=94 width=81) (actual time=2.468..1187.397 rows=26460 loops=1)
Join Filter: (attendance_calendars.calendar_id = attendance_calendar.calendar_id)
-> Nested Loop (cost=0.00..42.13 rows=1 width=70) (actual time=0.087..3.247 rows=735 loops=1)
Join Filter: ((attendance_calendars.title)::text = (school_gradelevels.linked_calendar)::text)
-> Nested Loop (cost=0.00..40.80 rows=1 width=277) (actual time=0.077..1.005 rows=245 loops=1)
-> Nested Loop (cost=0.00..39.61 rows=1 width=27) (actual time=0.064..0.572 rows=49 loops=1)
-> Seq Scan on teacher_join_classes_subjects (cost=0.00..10.48 rows=4 width=14) (actual time=0.022..0.143 rows=49 loops=1)
Filter: ((subject_id IS NULL) AND (syear = 2013::numeric) AND ((does_attendance)::text = 'Y'::text))
-> Index Scan using staff_pkey on staff (cost=0.00..7.27 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=49)
Index Cond: (staff.staff_id = teacher_join_classes_subjects.staff_id)
Filter: (staff.syear = 2013::numeric)
-> Seq Scan on attendance_calendars (cost=0.00..1.18 rows=1 width=250) (actual time=0.003..0.006 rows=5 loops=49)
Filter: (attendance_calendars.syear = 2013::numeric)
-> Seq Scan on school_gradelevels (cost=0.00..1.15 rows=15 width=11) (actual time=0.001..0.005 rows=15 loops=245)
-> Seq Scan on attendance_calendar (cost=0.00..55.26 rows=1864 width=18) (actual time=0.003..1.129 rows=1824 loops=735)
Filter: (attendance_calendar.school_date < ('now'::text)::date)
-> Hash (cost=1.41..1.41 rows=41 width=18) (actual time=0.040..0.040 rows=41 loops=1)
-> Seq Scan on school_classes (cost=0.00..1.41 rows=41 width=18) (actual time=0.006..0.015 rows=41 loops=1)
SubPlan 1
-> Seq Scan on attendance_completed com (cost=0.00..958.28 rows=5 width=4) (actual time=0.228..5.411 rows=17 loops=1764)
Filter: ((class_id = $0) AND (((period_id = 101::numeric) AND ($1 >= 151::numeric)) OR ((period_id = 95::numeric) AND ($1 = 150::numeric))))
NOT EXISTS is an excellent choice, and almost always better than NOT IN: NOT IN behaves unexpectedly as soon as the subquery can return a NULL, and it is harder for the planner to turn into an efficient anti-join.
I simplified your query a bit (which looks fine, generally):
SELECT DISTINCT ON (c.class_id, a.school_date)
c.class_id, c.class_name, c.grade_id
,g.linked_calendar, aa.calendar_id
,a.school_date, a.minutes
,t.staff_id, s.first_name, s.last_name
FROM school_classes c
JOIN teacher_join_classes_subjects t USING (class_id)
JOIN staff s USING (staff_id)
JOIN school_gradelevels g ON g.id = c.grade_id
JOIN attendance_calendars aa ON aa.title = g.linked_calendar
JOIN attendance_calendar a ON a.calendar_id = aa.calendar_id
WHERE t.syear = 2013
AND s.syear = 2013
AND aa.syear = 2013
AND t.does_attendance = 'Y' -- looks like it should be boolean!
AND t.subject_id IS NULL
AND a.school_date < CURRENT_DATE
AND NOT EXISTS (
SELECT 1
FROM attendance_completed x
WHERE x.class_id = c.class_id
AND x.school_date = a.school_date
AND (x.period_id = 101 AND a.minutes >= 151 OR -- actually numbers?
x.period_id = 95 AND a.minutes = 150)
)
ORDER BY c.class_id, a.school_date, ???
What seems to be missing is the ORDER BY that should accompany your DISTINCT ON. Add more ORDER BY items in place of ???: if there are several rows to pick from per (class_id, school_date), you probably want to define which one gets picked.
Numeric literals don't need single quotes and boolean values should be coded as such.
You may want to revisit the chapter about data types.
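As a small illustration of that last point (purely hypothetical, assuming syear is stored as an integer and does_attendance as a boolean, which is not what the current schema appears to use):
SELECT *
FROM teacher_join_classes_subjects t
WHERE t.syear = 2013      -- numeric literal: no quotes, no ::numeric cast in the plan
  AND t.does_attendance   -- a real boolean: no comparison against 'Y'
  AND t.subject_id IS NULL;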