Join same column from multiple tables - sql

Below is my current code. I'm not sure what the best way is to amend this to give me the results I need.
SELECT
T1.SC,
T1.AN,
T1.DOFS_DATE,
T2.M_ID,
T3.OPDT,
T4.MARKER,
T5.E_DTE,
T5.E_TME,
T5.E_PST_DTE,
T5.E_AMT,
T5.E_NAR_O,
T5.E_NAR_T
FROM E_Base.AR_MyTable T1
LEFT JOIN E_Base.Translation T2
ON T1.SC = T2.SC
AND T1.AN = T2.AN
LEFT JOIN E_Base.BA T3
ON T2.M_ID = T3.M_ID
LEFT JOIN E_Base.APF T4
ON T3.M_ID = T4.M_ ID
AND MARKER = 54
LEFT JOIN U_DB.TEH_201804 T5
ON T2.M_ID = T5.M_ID
AND T1.DOFS_DATE = T5.E_PST_DTE
QUALIFY ROW_NUMBER() OVER (PARTITION BY T2.M_ID ORDER BY T2.ID_END_DATE DESC, T3.E_END_DATE DESC) = 1
The above code works. However, it is the final left join on T5 where I need help.
In T1 each M_ID has assigned it's own DOFS_DATE that could be any date within the year and I want the data from T5 U_DB.TEH_201804 for the matching date. However, 5 U_DB.TEH_201804 relates to only April 2018. There are 12 tables with the same database (201804, 201805, 201806 etc) that all have the exact same columns but relate to a different month within the year.
Ideally, I want to left join the columns from T5 once but search all 12 tables within the database to bring back the data where the dates correspond.
I was thinking UNION but am unsure how to work this in.
Any help would be greatly appreciated!
Thanks

You could change you code related to table t5 wuth a left join on a subquery that select the union all for all the bale you need ...... (i have named the subquery TT)
SELECT
T1.SC,
T1.AN,
T1.DOFS_DATE,
T2.M_ID,
T3.OPDT,
T4.MARKER,
TT.E_DTE,
TT.E_TME,
TT.E_PST_DTE,
TT.E_AMT,
TT.E_NAR_O,
TT.E_NAR_T
FROM E_Base.AR_MyTable T1
LEFT JOIN E_Base.Translation T2
ON T1.SC = T2.SC
AND T1.AN = T2.AN
LEFT JOIN E_Base.BA T3
ON T2.M_ID = T3.M_ID
LEFT JOIN E_Base.APF T4
ON T3.M_ID = T4.M_ ID
AND MARKER = 54
LEFT JOIN (
select *
FROM U_DB.TEH_201804
UNION ALL
select *
FROM U_DB.TEH_201805
UNION ALL
select *
FROM U_DB.TEH_201806
UNION ALL
select *
FROM U_DB.TEH_201807
UNION ALL
.....
) TT ON T2.M_ID = TT.M_ID
AND T1.DOFS_DATE = TT.E_PST_DTE
QUALIFY ROW_NUMBER() OVER (PARTITION BY T2.M_ID ORDER BY T2.ID_END_DATE DESC, T3.E_END_DATE DESC) = 1

It's hard to tell without additional details like explain and QueryLog step data.
Based on #scaisEdge answer:
You can try to move the first two joins into a Derived Table to apply the ROW_NUMBER early (possible because you do Outer Joins only):
SELECT
dt.*,
T4.MARKER,
TT.E_DTE,
TT.E_TME,
TT.E_PST_DTE,
TT.E_AMT,
TT.E_NAR_O,
TT.E_NAR_T
FROM
(
SELECT
T1.SC,
T1.AN,
T1.DOFS_DATE,
T2.M_ID,
T3.OPDT
FROM E_Base.AR_MyTable T1
LEFT JOIN E_Base.Translation T2
ON T1.SC = T2.SC
AND T1.AN = T2.AN
LEFT JOIN E_Base.BA T3
ON T2.M_ID = T3.M_ID
QUALIFY Row_Number()
Over (PARTITION BY T2.M_ID
ORDER BY T2.ID_END_DATE DESC, T3.E_END_DATE DESC) = 1
) AS dt
LEFT JOIN E_Base.APF T4
ON dt.M_ID = T4.M_ID
AND MARKER = 54
LEFT JOIN
(
SELECT *
FROM U_DB.TEH_201804
UNION ALL
SELECT *
FROM U_DB.TEH_201805
UNION ALL
SELECT *
FROM U_DB.TEH_201806
UNION ALL
SELECT *
FROM U_DB.TEH_201807
UNION ALL
.....
) TT
ON dt.M_ID = TT.M_ID
AND dt.DOFS_DATE = TT.E_PST_DTE
It might also help the optimizer to provide additional info about the data ranges. Those tables should have CHECK-constraints to tell the optimizer that they contain only data from a single month, if they don't exist try adding a WHERE-condition to each Select, e.g. WHERE E_PST_DTE BETWEEN DATE '2018-04-01' AND DATE '2018-04-30'.
Of course, always check Explain if the plan actually changes...

Related

SQL split repeating rows caused by UNION

I am writing a query to look through and get two seperate averages based on where conditions.
I tried two select statetments but ended up with lots of duplicates.
Now I have a union which works pretty well, although I have my two fields in alternating rows instead of seperate columns.
Can anyone suggest a fix, sorry for the dodgy code!
SELECT
tblSkillName.skillName,
tblTestScores.skillUID,
AVG(tblTestScores.percentage) AS `cohortPercentage`
FROM
(
(
(
tblTestScores
INNER JOIN tblUsers ON tblUsers.email = tblTestScores.email
)
INNER JOIN tblTestDetails ON tblTestScores.testDetailsID = tblTestDetails.testDetailsID
)
INNER JOIN tblSkillName ON tblSkillName.skillUID = tblTestScores.skillUID
)
WHERE
teacherGroup = '9JS2/Cp'
AND tblTestScores.testDetailsID = 1
GROUP BY
skillName
UNION ALL
SELECT
tblSkillName.skillName,
tblTestScores.skillUID,
AVG(tblTestScores.percentage) AS `groupPercentage`
FROM
(
(
(
tblTestScores
INNER JOIN tblUsers ON tblUsers.email = tblTestScores.email
)
INNER JOIN tblTestDetails ON tblTestScores.testDetailsID = tblTestDetails.testDetailsID
)
INNER JOIN tblSkillName ON tblSkillName.skillUID = tblTestScores.skillUID
)
WHERE
tblTestScores.testDetailsID = 1
GROUP BY
skillName
ORDER BY
skillUID ASC

Query result not working with array type in PostgreSQL

I have two tables which contains a column with data type array in PostgreSQL. The structure is like below:
tbl_tour_packages
tbl_header_images
I have a query which contains several joins. The query is working fine with other joins and showing no error. But missing the values from tbl_header_images.
The query is:
SELECT
t1.tour_id AS pid,
t1.tour_name AS title,
t1.tour_duration AS nights,
t1.tour_price_full AS price,
t1.discount AS discount,
t1.tour_seo_title AS seo,
t3.category AS category,
t4.image_names[1] AS image_url,
CASE WHEN max(s.state_name) IS NULL THEN NULL ELSE array_agg(s.state_name) END AS state,
CASE WHEN max(o.destination) IS NULL THEN NULL ELSE array_agg(o.destination) END AS destinations
FROM tbl_tour_packages t1
LEFT JOIN tbl_countries t2 ON t1.tour_country_iso = t2.iso
LEFT JOIN tbl_categories t3 on t1.tour_category_id = t3.id
LEFT JOIN tbl_header_images t4 ON t1.tour_id = t4.package_id
LEFT JOIN tbl_states AS s ON (t1.tour_state #> array[s.state_code])
LEFT JOIN tbl_destinations AS o ON (t1.tour_destination #> array[o.id])
WHERE t1.tour_status = 1
GROUP BY 1,7,8
ORDER BY view_count ASC LIMIT 6
I want to get the 'image_name' from tbl_header_images. Any quick help or suggestion will be appreciated.
before WHERE clause you should be able to do something like:
, unnest(image_names) _image_names
and then in select statement aggregate that back into an array
array_agg(_image_names) AS image_names
I don't quite get the t4.image_names[1] AS image_url attempt, but I'm sure you can pick it up from here.
so the whole query would be something like:
edit: I've stripped extra groupping
SELECT
t1.tour_id AS pid,
t1.tour_name AS title,
t1.tour_duration AS nights,
t1.tour_price_full AS price,
t1.discount AS discount,
t1.tour_seo_title AS seo,
t3.category AS category,
(array_agg(_image_names))[1] AS image_url,
CASE WHEN max(s.state_name) IS NULL THEN NULL ELSE array_agg(s.state_name) END AS state,
CASE WHEN max(o.destination) IS NULL THEN NULL ELSE array_agg(o.destination) END AS destinations
FROM tbl_tour_packages t1
LEFT JOIN tbl_countries t2 ON t1.tour_country_iso = t2.iso
LEFT JOIN tbl_categories t3 on t1.tour_category_id = t3.id
LEFT JOIN tbl_header_images t4 ON t1.tour_id = t4.package_id
LEFT JOIN tbl_states AS s ON (t1.tour_state #> array[s.state_code])
LEFT JOIN tbl_destinations AS o ON (t1.tour_destination #> array[o.id])
, unnest(t4.image_names) AS _image_names
WHERE t1.tour_status = 1
GROUP BY 1,7
ORDER BY view_count ASC LIMIT 6
alternatively I'd go with subselect:
SELECT t1.*,
(SELECT image_names[1] FROM tbl_header_images WHERE package_id = t1.tour_id) AS image_url
FROM t1, t2, t3
WHERE ...

How to join 100 random rows from table 1 multiple other tables in oracle

I have scrapped my previous question as I did not do a good job explaining. Maybe this will be simpler.
I have the following query.
Select * from comp_eval_hdr, comp_eval_pi_xref, core_pi, comp_eval_dtl
where comp_eval_hdr.START_DATE between TO_DATE('01-JAN-16' , 'DD-MON-YY')
and TO_DATE('12-DEC-17' , 'DD-MON-YY')
and comp_eval_hdr.COMP_EVAL_ID = comp_eval_dtl.COMP_EVAL_ID
and comp_eval_hdr.COMP_EVAL_ID = comp_eval_pi_xref.COMP_EVAL_ID
and core_pi.PI_ID = comp_eval_pi_xref.PI_ID
and core_pi.PROGRAM_CODE = 'PS'
Now if I only want a random 100 rows from the comp_eval_hdr table to join with the other tables how would I go about it? If it makes it easier you can disregard the comp_eval_dtl table.
I think you are pretty much there. You just need subqueries, table aliases, and JOIN conditions:
SELECT . . .
FROM (SELECT a.*
FROM (SELECT a.*
FROM a
WHERE a.START_DATE BEWTWEEN DATE '2016-01-01' AND DATE '2017-12-12'
ORDER BY DBMS_RANDOM.VALUE
) a
WHERE ROWNUM <= 100
) a JOIN
mapping m
ON a.? = m.? JOIN
b
ON m.? = b.?;
The ? is just a placeholder for the join columns.
It's a bit of a stretch to know what you want with the question as written but here's my attempt.
WITH rand_list AS
(SELECT * FROM comp_eval_hdr
WHERE comp_eval_hdr.START_DATE BEWTWEEN TO_DATE('01-JAN-16' , 'DD-MON-YY') AND TO_DATE('12-DEC-17' , 'DD-MON-YY')
ORDER BY DBMS_RANDOM.VALUE)
first_100 AS
(SELECT *
FROM rand_list
WHERE ROWNUM <=100)
SELECT md.col_1, t3.col_a
FROM first_100 md
INNER JOIN
table2 t2 ON md.id_column = t2.fk_comp_eval_hdr_id
INNER JOIN
table3 t3 ON t3.id_column = t2.fk_table3_id
You haven't given any indication how they join or the table names and obviously I haven't run this against any mock tables.
You've got a list of randomised records with RAND_LIST which you could, if you wanted, combine with the FIRST_100 query (your choice).
The main query then just joins that through your mapping table (T2) to your 'multiples' table (T3).
how does table 2 look like?...Let me put one example as person table and order table?
select * from (
select * from person ps , order order where ps.city = 'mumbai' and ps.id = order.purchasedby ) porder where porder.rownum <= 100
I did not tested it but it will look something like this.

select the latest date from joins

I am facing problem in join. I am getting duplicate records.Please help me to resolve that..
This is my query:
select requestinstanceid from
requestidt [RIM]
inner join requestcdt [RCDT] on [RIM].requestinstanceid = [RCDT].requestinstanceid
left join requestcmt [RCMT] on [RCDT].requestcommentid = [RCMT].requestcommentid
inner join requestddt [RDDT] on [RDDT].requestinstanceid = [RIM].requestinstanceid
left join requestdmt [RDMT] on [RDMT].requestdocumentid = [RDDT].requestdocumentid
I am getting result like this:
requestinstanceid
184
184
386
389
389
397
I should get not get duplicate record and I want to get the latest date from each record.
This code joins to the 'top 1' of a derived table. You should be able to work out how it could be applied to join to the Top 1 of sub-query sorted by date DESC.
SELECT T2.TempEmailID, T1.EmailID
FROM tbl1 T1
LEFT JOIN (SELECT *, RANK() Over (Partition By EmailID Order By TempEmailID DESC) as TopOne FROM tbl2) T2 ON T1.EmailID = T2.EmailID AND TopOne = 1

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
this will select the latest "EsnId" from "EsnsSalesOrderItems" based on the column creation_date. As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specifiy an order or the rows. A table as such is not ordered, nor is the result of a query unless you specify an order by
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...