I have 4 tables:
A user account table
user_id | username | password
---------+----------+----------
A projects table
project_id | project_name | category_id
------------+------------------------------+-------------
An accounts_projects table (many-to-many relationship between accounts and projects)
accounts_projects_id | account_id | project_id
----------------------+------------+------------
A project_messages table (a project will have many messages)
message_id | project_id | message | username
------------+------------+---------+----------
At login, I fetch the projects a user belongs to, along with the messages for each project, using the query below:
SELECT account.user_id,account.username,
array_agg(json_build_object('message',project_messages.message,'username',project_messages.username)) AS messages,
project.project_name
FROM account
JOIN accounts_projects ON account.user_id = accounts_projects.account_id
JOIN project_messages ON accounts_projects.project_id = project_messages.project_id
JOIN project ON project.project_id = accounts_projects.project_id
WHERE account.username=$1
GROUP BY project.project_name,account.user_id
This gives me the output below:
user_id, username, messages (json array object), project_name
87;"kannaj";"{"{\"message\" : \"saklep\", \"username\" : \"kannaj\"}"}";"Football with Javascript"
87;"kannaj";"{"{\"message\" : \"work\", \"username\" : \"kannaj\"}","{\"message\" : \"you've been down to long in the midnight sea\", \"username\" : \"kannaj\"}","{\"message\" : \"Yeaaaa\", \"username\" : \"house\"}"}";"Machine Learning with Python"
87;"kannaj";"{"{\"message\" : \"holyy DIVVEERRR\", \"username\" : \"kannaj\"}"}";"Beethoven with react"
Is there a way I can use the LIMIT/OFFSET function when retrieving the messages from the project_messages table?
To keep the examples simple, let's say we have two linked tables:
t1(id);
t2(id, t1_id);
And the query is:
select t1.id, array_agg(t2.id)
from t1 join t2 on (t1.id = t2.t1_id)
group by t1.id;
As you can see, it is a very simplified variant of your larger query.
1) Arrays
select t1.id, (array_agg(t2.id order by t2.id desc))[3:5]
from t1 join t2 on (t1.id = t2.t1_id)
group by t1.id;
This query works just like the original, but returns only elements 3, 4 and 5 of the array, which is equivalent to OFFSET 2 LIMIT 3.
2) Subquery and lateral
select
t1.id,
array_agg(t.x)
from
t1 join lateral
(select t2.id as x from t2 where t1.id = t2.t1_id order by t2.id desc offset 2 limit 3) t on (true)
group by t1.id;
Here the lateral keyword allows the subquery to reference columns from tables mentioned earlier in the main FROM clause (t1.id).
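For engines that have neither array slicing nor LATERAL, the same per-group OFFSET 2 LIMIT 3 can be approximated with ROW_NUMBER(). The following is only a sketch, not one of the two methods above: it runs against an in-memory SQLite database (assuming SQLite 3.25 or later for window functions), and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data for the simplified t1/t2 schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, t1_id INTEGER REFERENCES t1(id));
    INSERT INTO t1 (id) VALUES (1), (2);
    INSERT INTO t2 (id, t1_id) VALUES
        (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 1),
        (20, 2), (21, 2), (22, 2);
""")

# Number the t2 rows inside each t1 group, newest id first,
# then keep positions 3..5 (the equivalent of OFFSET 2 LIMIT 3).
rows = conn.execute("""
    SELECT t1_id, id
    FROM (
        SELECT t2.t1_id, t2.id,
               ROW_NUMBER() OVER (PARTITION BY t2.t1_id ORDER BY t2.id DESC) AS rn
        FROM t2
    ) AS ranked
    WHERE rn BETWEEN 3 AND 5
    ORDER BY t1_id, id DESC
""").fetchall()

# Collect the page of ids per t1 row (what array_agg does in PostgreSQL).
page = {}
for t1_id, t2_id in rows:
    page.setdefault(t1_id, []).append(t2_id)

print(page)
```

As with the inner lateral join, a group with fewer than three rows would not appear in the result at all.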
I have a client table with a foreign key to itself: each client has a specific id in each department but one master id. I am trying to find the most efficient way to restrict my query to just the master entry.
Here are the two (simplified) queries I have. Both work, but I feel there is a more efficient way to accomplish this, especially when joining to other large tables:
-- version 1
select
client.id
from
client
join client client2 on client.id = client2.masterid
and client2.id = client2.masterid
--version 2
select
client.id
from
client
where
client.id = client.masterid
-- Expanded view
select
t1.id masterid,
t1.dob dob,
trunc((months_between(trunc(sysdate),t1.dob)/12),0) age,
case
when substr(t1.zip,1,5) in ('48502','48503','48504','48505','48506','48507','48529','48532') then null
else
(select
max(audit1.operationid)
from
t2 audit1
where
t1.id = audit1.sourceid
and audit1.fieldname = 'ZIP'
and substr(audit1.oldvalue,1,5) in ('48502','48503','48504','48505','48506','48507','48529','48532')
and audit1.created >= to_date('04/25/2014', 'MM/DD/YYYY')
and 1 < (
select
count(audr.id)
from
t2 audr
WHERE
audr.operationid = audit1.operationid
and audr.fieldname in ('ADDRESS1','CITY')
)
) end auditref,
t1.address1 addr1,
t1.address2 addr2,
t1.city city,
substr(t1.zip,1,5) zip
from
t1
where
t1.id = t1.masterid
and 1 = case
when substr(t1.zip,1,5) in ('48502','48503','48504','48505','48506','48507','48529','48532') then 1
when substr(t1.zip,1,5) not in ('48502','48503','48504','48505','48506','48507','48529','48532') and exists
(select
1
from
t2 audit2
where
audit2.sourceid = t1.id
and audit2.fieldname = 'ZIP'
and substr(audit2.oldvalue,1,5) in ('48502','48503','48504','48505','48506','48507','48529','48532')
and audit2.created >= to_date('04/25/2014', 'MM/DD/YYYY')
) then 1
else 0
end
Any thoughts would be appreciated. The other ways I have tried to write these joins produced duplicate rows, because there can be many ids for each masterid.
Edit:
The expanded view above is a fuller version of the query. It has more joins and filters, and with them the client.id = client.masterid condition makes the query run much slower.
The question is how to limit the t1 and t2 table scans most effectively, as these tables are huge.
Using the following join accomplished the goal of limiting the table scans:
from
client
left join client client1 on client1.masterid = client.id and client1.id is null
I have two tables, [DrugPrescriptionEdition] and [PrescriptionDoseDetail]. I join them with the query below to get a result set:
select * from DrugPrescription dp where id in(
SELECT distinct dpe.template
FROM [DrugPrescriptionEdition] dpe
join PrescriptionDoseDetail pdd on pdd.prescription = dpe.id
where doseEnd_endDate is NULL and doseEnd_doseEndType =1
)
Now I want to keep only the records whose 'datasource' column contains the combination (1, 2) for the same prescription.id.
Example: prescriptionID = 4 qualifies because its records contain both 1 and 2. Prescriptions that contain only 1 or only 2 should not be considered.
I need some expert help adding this condition to the query above.
Expected result: the result of the query above, filtered by this new condition as well.
Let me assume your records are in a single table. Here is one method:
select t.*
from t
where (t.dataSource = 1 and
exists (select 1
from t t2
where t2.prescriptionid = t.prescriptionid and
t2.dataSource = 2
)
) or
(t.dataSource = 2 and
exists (select 1
from t t2
where t2.prescriptionid = t.prescriptionid and
t2.dataSource = 1
)
);
It is unclear if any other data sources are allowed. If they are not, then wrap the whole OR condition above in parentheses (AND binds more tightly than OR) and add:
and
not exists (select 1
from t t3
where t3.prescriptionid = t.prescriptionid and
t3.dataSource not in (1, 2)
)
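To check the behaviour of the two variants, here is a small sketch that runs them against an in-memory SQLite database. The table t and its rows are hypothetical (prescription 4 has sources 1 and 2, prescriptions 5 and 6 have only one of them, prescription 7 also has a source 3), and the second EXISTS looks for dataSource = 1, which is what the "both 1 and 2" requirement needs.

```python
import sqlite3

# Hypothetical single-table layout: one row per (prescription, data source).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, prescriptionid INTEGER, dataSource INTEGER);
    INSERT INTO t (id, prescriptionid, dataSource) VALUES
        (1, 4, 1), (2, 4, 2),
        (3, 5, 1),
        (4, 6, 2),
        (5, 7, 1), (6, 7, 2), (7, 7, 3);
""")

# Rows with source 1 or 2 whose prescription also has the other source.
# The OR is wrapped in parentheses so a further AND applies to both branches.
BOTH_SOURCES = """
    SELECT t.id
    FROM t
    WHERE ((t.dataSource = 1 AND EXISTS (SELECT 1 FROM t t2
                                         WHERE t2.prescriptionid = t.prescriptionid
                                           AND t2.dataSource = 2))
        OR (t.dataSource = 2 AND EXISTS (SELECT 1 FROM t t2
                                         WHERE t2.prescriptionid = t.prescriptionid
                                           AND t2.dataSource = 1)))
"""

# Stricter variant: the prescription must not have any other data source.
ONLY_ONE_AND_TWO = BOTH_SOURCES + """
      AND NOT EXISTS (SELECT 1 FROM t t3
                      WHERE t3.prescriptionid = t.prescriptionid
                        AND t3.dataSource NOT IN (1, 2))
"""

loose = [r[0] for r in conn.execute(BOTH_SOURCES + " ORDER BY t.id")]
strict = [r[0] for r in conn.execute(ONLY_ONE_AND_TWO + " ORDER BY t.id")]
print(loose, strict)
```

In this sample the loose variant keeps the rows of prescriptions 4 and 7, while the strict variant keeps only prescription 4.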
How can I get the row with MAX(changeNo) for each "number" entry (where note is a column joined from another table)? My current query is:
SELECT T0.id, T0.number, T0.date, T0.changeNo, T1.note
FROM
table T0 INNER JOIN table2 T1 ON T0.joinID = T1.joinID
WHERE
changeNo = (SELECT max(changeNo) FROM table)
But this uses the global maximum, not the maximum for each "number" entry.
input data:
id|number|date|changeNo|note
01|150052|1603|00000001|0x22
02|150052|1603|00000002|0x45
03|150052|1603|00000003|0x64
04|150053|1603|00000001|0x89
05|150053|1603|00000002|0x56
06|150054|1603|00000001|0x77
07|150054|1603|00000002|0x84
08|150055|1603|00000001|0x46
expected output:
id|number|date|changeNo|note
03|150052|1603|00000003|0x64
05|150053|1603|00000002|0x56
07|150054|1603|00000002|0x84
08|150055|1603|00000001|0x46
You want a correlated subquery:
SELECT T0.id, T0.number, T0.date, T0.changeNo, T1.note
FROM table T0 INNER JOIN
table2 T1
ON T0.joinID = T1.joinID
WHERE T0.changeNo = (SELECT max(tt0.changeNo) FROM table tt0 WHERE tt0.number = T0.number);
This returns the row with the maximum changeNo for each number. Note: There are other ways to implement such a query. Often the correlated subquery is the fastest method with the right indexes, particularly one on table(number, changeNo).
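As a quick check against the question's sample data, the sketch below runs a correlated subquery in an in-memory SQLite database. It correlates on number, since the expected output is one row per number. The table names changes and notes, the omitted date column, and the assumption that each row has its own joinID are all hypothetical stand-ins for the question's table and table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for "table" and "table2" from the question.
    CREATE TABLE changes (id INTEGER PRIMARY KEY, number INTEGER, changeNo INTEGER, joinID INTEGER);
    CREATE TABLE notes (joinID INTEGER PRIMARY KEY, note TEXT);
    INSERT INTO changes (id, number, changeNo, joinID) VALUES
        (1, 150052, 1, 1), (2, 150052, 2, 2), (3, 150052, 3, 3),
        (4, 150053, 1, 4), (5, 150053, 2, 5),
        (6, 150054, 1, 6), (7, 150054, 2, 7),
        (8, 150055, 1, 8);
    INSERT INTO notes (joinID, note) VALUES
        (1, '0x22'), (2, '0x45'), (3, '0x64'), (4, '0x89'),
        (5, '0x56'), (6, '0x77'), (7, '0x84'), (8, '0x46');
""")

# For every row, compare its changeNo with the maximum changeNo
# among rows that share the same number.
rows = conn.execute("""
    SELECT c.id, c.number, c.changeNo, n.note
    FROM changes AS c
    JOIN notes AS n ON n.joinID = c.joinID
    WHERE c.changeNo = (SELECT MAX(c2.changeNo)
                        FROM changes AS c2
                        WHERE c2.number = c.number)
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```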
I have a working SQL SELECT, which looks like this:
[Edited: I'm sorry, I made a mistake in the question. I edited the alias of Table1, and I'm trying the answers.]
SELECT
m.Column1
,t2.Column2
,COALESCE
(
(
SELECT TOP 1 Vat
FROM LinkedDBServer.DatabaseName.dbo.TableName t3
WHERE
m.MaterialNumber = t3.MaterialNumber COLLATE Czech_CI_AS
and t3.Currency = …
and ...
ORDER BY [Date] DESC
), m.Vat
) as Vat
FROM Table1 m
JOIN Table2 t2 on (m.Column1 = t2.Column1)
It works, but it takes too long: the linked server cuts my connection because the query runs for more than 10 minutes. The purpose of the query is to get newer data from a different database if it exists. I get the newest row with TOP 1 ordered by date, and the precondition is that all data in that database is newer than mine, which is why I'm using COALESCE.
My thought is that rewriting it as a JOIN could make it faster. Another problem could be that I don't have a primary key (and can't change that).
How can I speed up this query? (I'm using SQL Server 2008 R2.)
Thank you
Here is the attached estimated query plan (readable with browser zoom). The estimate is for 2 COALESCE columns.
Try rewriting the query using OUTER APPLY:
SELECT
t1.Column1
,m.Column2
,COALESCE(ou.vat, m.Vat) as Vat
FROM Table1 t1
JOIN Table2 m on (m.Column1 = t1.Column1)
outer apply
(
SELECT TOP 1 Vat
FROM LinkedDBServer.DatabaseName.dbo.TableName t3
WHERE
m.MaterialNumber = t3.MaterialNumber COLLATE Czech_CI_AS
and t3.Currency = …
and ...
ORDER BY [Date] DESC
) ou
Another option:
; WITH vat AS (
SELECT MaterialNumber COLLATE Czech_CI_AS As MaterialNumber
, Vat
, Row_Number() OVER (PARTITION BY MaterialNumber ORDER BY "Date" DESC) As sequence
FROM LinkedDBServer.DatabaseName.dbo.TableName
WHERE Currency = ...
AND ...
)
SELECT t1.Column1
, m.Column2
, Coalesce(vat.Vat, m.Vat) As Vat
FROM Table1 As t1
INNER
JOIN Table2 As m
ON m.Column1 = t1.Column1
LEFT
JOIN vat
ON vat.MaterialNumber = m.MaterialNumber
AND vat.sequence = 1
;
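The shape of the ROW_NUMBER() variant can be tried locally with the sketch below. It says nothing about linked-server performance: RemoteVat is a hypothetical local table standing in for LinkedDBServer.DatabaseName.dbo.TableName, the elided Currency filters are left out, the sample rows are made up, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Column1 INTEGER);
    CREATE TABLE Table2 (Column1 INTEGER, Column2 TEXT, MaterialNumber TEXT, Vat INTEGER);
    -- Hypothetical local stand-in for the linked-server table.
    CREATE TABLE RemoteVat (MaterialNumber TEXT, Vat INTEGER, "Date" TEXT);
    INSERT INTO Table1 VALUES (1), (2);
    INSERT INTO Table2 VALUES (1, 'a', 'M1', 10), (2, 'b', 'M2', 20);
    INSERT INTO RemoteVat VALUES ('M1', 15, '2020-01-01'), ('M1', 21, '2021-01-01');
""")

# Rank remote rows per material, newest first, and join only rank 1.
# Materials without a remote row fall back to the local Vat via COALESCE.
rows = conn.execute("""
    WITH latest_vat AS (
        SELECT MaterialNumber, Vat,
               ROW_NUMBER() OVER (PARTITION BY MaterialNumber ORDER BY "Date" DESC) AS seq
        FROM RemoteVat
    )
    SELECT t1.Column1, m.Column2, COALESCE(v.Vat, m.Vat) AS Vat
    FROM Table1 AS t1
    JOIN Table2 AS m ON m.Column1 = t1.Column1
    LEFT JOIN latest_vat AS v
           ON v.MaterialNumber = m.MaterialNumber AND v.seq = 1
    ORDER BY t1.Column1
""").fetchall()

print(rows)
```

Material M1 picks up the newest remote value, while M2 has no remote row and keeps its local Vat.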
Let's say I have a table, e.g.:
Request No.   Type   Status
---------------------------
1             New    Renewed
and then another table:
Action ID   Request No   LastUpdated
------------------------------------
1           1            06-10-2010
2           1            07-14-2010
3           1            09-30-2010
How can I join the second table with the first table but only get the latest record from the second table (e.g. ordered by LastUpdated DESC)?
SELECT T1.RequestNo ,
T1.Type ,
T1.Status,
T2.ActionId ,
T2.LastUpdated
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.RequestNo = T2.RequestNo
WHERE NOT EXISTS
(SELECT *
FROM TABLE2 T2B
WHERE T2B.RequestNo = T2.RequestNo
AND T2B.LastUpdated > T2.LastUpdated
)
Using aggregates:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
JOIN (SELECT t.request_no,
MAX(t.lastupdated) AS latest
FROM REQUEST_EVENTS t
GROUP BY t.request_no) x ON x.request_no = re.request_no
AND x.latest = re.lastupdated
Using NOT EXISTS:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
WHERE NOT EXISTS(SELECT NULL
FROM REQUEST_EVENTS re2
WHERE re2.request_no = r.request_no
AND re2.LastUpdated > re.LastUpdated)
SELECT *
FROM REQUEST, ACTION
WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO --Joining here
AND ACTION.LastUpdated = (SELECT MAX(LastUpdated) FROM ACTION WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO);
The sub-query finds the latest LastUpdated value for the request, and the outer condition matches against it so that older records are not joined.
Granted, depending on how precise the LastUpdated field is, two records updated on the same date will both be returned. Any other implementation has the same problem, so either the precision has to be increased or some other distinguishing column or logic has to break the tie to prevent multiple rows being returned.
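The tie problem described above can be seen in a small sketch. It uses an in-memory SQLite database with hypothetical tables requests and actions, made-up rows (request 2 has two actions on the same date), and ISO-formatted dates so that text comparison orders them correctly. Breaking the tie on ActionID is an assumption for illustration; any unique column would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (RequestNo INTEGER PRIMARY KEY, Status TEXT);
    CREATE TABLE actions (ActionID INTEGER PRIMARY KEY, RequestNo INTEGER, LastUpdated TEXT);
    INSERT INTO requests VALUES (1, 'Renewed'), (2, 'New');
    INSERT INTO actions VALUES
        (1, 1, '2010-06-10'), (2, 1, '2010-07-14'), (3, 1, '2010-09-30'),
        (4, 2, '2010-09-30'), (5, 2, '2010-09-30');
""")

# Matching on MAX(LastUpdated) returns every action that shares the latest date.
with_ties = conn.execute("""
    SELECT r.RequestNo, a.ActionID
    FROM requests AS r
    JOIN actions AS a ON a.RequestNo = r.RequestNo
    WHERE a.LastUpdated = (SELECT MAX(a2.LastUpdated)
                           FROM actions AS a2
                           WHERE a2.RequestNo = r.RequestNo)
    ORDER BY r.RequestNo, a.ActionID
""").fetchall()

# Ranking with a tie-breaker (here the highest ActionID) keeps exactly one row per request.
one_per_request = conn.execute("""
    SELECT RequestNo, ActionID
    FROM (
        SELECT a.RequestNo, a.ActionID,
               ROW_NUMBER() OVER (PARTITION BY a.RequestNo
                                  ORDER BY a.LastUpdated DESC, a.ActionID DESC) AS rn
        FROM actions AS a
    ) AS ranked
    WHERE rn = 1
    ORDER BY RequestNo
""").fetchall()

print(with_ties, one_per_request)
```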
SELECT r.RequestNo, r.Type, r.Status, MAX(a.LastUpdated) AS LastUpdated
FROM Request r
INNER JOIN Action a ON r.RequestNo = a.RequestNo
GROUP BY r.RequestNo, r.Type, r.Status
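We can use TOP 1 with an ORDER BY clause. Note that TOP 1 returns a single row for the whole result, so this gives the latest action across all requests; add a WHERE filter on rq.ID to get the latest action for one specific request. For instance, if your tables are RequestTable(ID, Type, Status) and ActionTable(ActionID, RequestID, LastUpdated), the query will look like this: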
Select Top 1 rq.ID, rq.Status, at.ActionID
From RequestTable as rq
JOIN ActionTable as at ON rq.ID = at.RequestID
Order by at.LastUpdated DESC