How to simplify nested SQL cross join? - sql

I'm using Postgres 9.3 and have the following four tables to allow maximum flexibility regarding price and/or tax rate changes in the future (see below for more details):
CREATE TABLE main.products
(
  id serial NOT NULL,
  "productName" character varying(255) NOT NULL,
  "productStockAmount" real NOT NULL
);
CREATE TABLE main."productPrices"
(
  id serial NOT NULL,
  product_id integer NOT NULL,
  "productPriceValue" real NOT NULL,
  "productPriceValidFrom" timestamp without time zone NOT NULL
);
CREATE TABLE main."productTaxes"
(
  id serial NOT NULL,
  product_id integer NOT NULL,
  "productTaxValidFrom" timestamp without time zone NOT NULL,
  "taxRate_id" integer NOT NULL
);
CREATE TABLE main."taxRateValues"
(
  id integer NOT NULL,
  "taxRate_id" integer NOT NULL,
  "taxRateValueValidFrom" timestamp without time zone NOT NULL,
  "taxRateValue" real
);
I built a view based on the following query to get the currently relevant values:
SELECT p.id, p."productName", p."productStockAmount", sub."productPriceValue", CHR(64+sub3."taxRate_id") AS taxRateId, sub3."taxRateValue" FROM main."products" p
CROSS JOIN LATERAL (SELECT * FROM main."productPrices" pp2 WHERE pp2."product_id"=p."id" AND pp2."productPriceValidFrom" <= NOW() ORDER BY pp2."productPriceValidFrom" DESC LIMIT 1) AS sub
CROSS JOIN LATERAL (SELECT * FROM main."productTaxes" pt WHERE pt."product_id"=p."id" AND pt."productTaxValidFrom" <= NOW() ORDER BY pt."productTaxValidFrom" DESC LIMIT 1) AS sub2
CROSS JOIN LATERAL (SELECT * FROM main."taxRateValues" trv WHERE trv."taxRate_id"=sub2."taxRate_id" AND trv."taxRateValueValidFrom" <= NOW() ORDER BY trv."taxRateValueValidFrom" DESC LIMIT 1) AS sub3
This works fine and gives me the correct results, but I expect performance problems once several thousand products, price changes, etc. are in the database.
Is there anything I can do to simplify the statement or the overall database design?
To use words to describe the needed flexibility:
Prices can be changed and I have to record which price is valid to which time (archival, so not only the current price is needed)
Applied tax rates for products can be changed (e.g. due to changes by law) - archival also needed
Tax rates in general can be changed (also by law, but not related to a single product but all products with this identifier)
Some examples of things that can happen:
Product X changes price from 100 to 200 at 2014-05-09
Product X changes tax rate from A to B at 2014-07-01
Tax rate value for tax rate A changes from 16 to 19 at 2014-09-01

As long as you fetch all rows or more than a few percent of all rows, it will be substantially faster to first aggregate once per table, and then join.
I suggest DISTINCT ON to pick the latest valid row per id:
SELECT p.id, p."productName", p."productStockAmount"
     , pp."productPriceValue"
     , CHR(64 + tr."taxRate_id") AS "taxRateId", tr."taxRateValue"
FROM   main.products p
LEFT   JOIN (
   SELECT DISTINCT ON (product_id)
          product_id, "productPriceValue"
   FROM   main."productPrices"
   WHERE  "productPriceValidFrom" <= now()
   ORDER  BY product_id, "productPriceValidFrom" DESC
   ) pp ON pp.product_id = p.id
LEFT   JOIN (
   SELECT DISTINCT ON (product_id)
          product_id, "taxRate_id"
   FROM   main."productTaxes"
   WHERE  "productTaxValidFrom" <= now()
   ORDER  BY product_id, "productTaxValidFrom" DESC
   ) pt ON pt.product_id = p.id
LEFT   JOIN (
   SELECT DISTINCT ON ("taxRate_id") *
   FROM   main."taxRateValues"
   WHERE  "taxRateValueValidFrom" <= now()
   ORDER  BY "taxRate_id", "taxRateValueValidFrom" DESC
   ) tr ON tr."taxRate_id" = pt."taxRate_id";
Using LEFT JOIN to be on the safe side, since not every product might have entries in all sub-tables.
And I subscribe to what @Clodoaldo wrote about double-quoted identifiers. I never use anything but legal, lower-case names. Makes your life with Postgres easier.
Detailed explanation for DISTINCT ON:
Select first row in each GROUP BY group?
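To keep this fast at scale, matching multicolumn indexes help: each DISTINCT ON subquery wants its rows ordered by (grouping column, validity timestamp DESC). A sketch (the index names here are made up):

```sql
-- One index per history table, matching the ORDER BY of its DISTINCT ON subquery,
-- so Postgres can read the latest row per group without a full sort:
CREATE INDEX product_prices_latest_idx  ON main."productPrices" (product_id, "productPriceValidFrom" DESC);
CREATE INDEX product_taxes_latest_idx   ON main."productTaxes"  (product_id, "productTaxValidFrom" DESC);
CREATE INDEX tax_rate_values_latest_idx ON main."taxRateValues" ("taxRate_id", "taxRateValueValidFrom" DESC);
```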

Do not create quoted identifiers. Once you do it you are forever stuck with them and you will have to quote and remember the casing everywhere. You can use camel case whenever you want if you don't quote the identifier at creation time.
I don't understand why you need the cross lateral. I think it can be just
select
p.id,
p."productName",
p."productStockAmount",
pp2."productPriceValue",
chr(64 + trv."taxRate_id") as "taxRateId",
trv."taxRateValue"
from
main."products" p
left join (
select *
from main."productPrices"
where "productPriceValidFrom" <= now()
order by "productPriceValidFrom" desc
limit 1
) pp2 on pp2."product_id" = p."id"
left join (
select "product_id", "taxRate_id"
from main."productTaxes"
where "productTaxValidFrom" <= now()
order by "productTaxValidFrom" desc
limit 1
) pt on pt."product_id" = p."id"
left join (
select *
from main."taxRateValues"
where "taxRateValueValidFrom" <= now()
order by "taxRateValueValidFrom" desc
limit 1
) trv on trv."taxRate_id" = pt."taxRate_id"

Related

Selecting current valid record of historical Data with SQL

I have 2 tables
Table Customer
customer_shortcut (char)
Table CustomerData
customerID (ForeignKey to Customer)
customer_valid (Valid date for the record)
customer_name (char)
Table CustomerData can have multiple records for a customer, but with different valid dates, e.g.:
01.01.2019
01.01.2020
01.01.2021
I managed to get the last record for each customer using the query:
SELECT Customer.*
FROM Customer
FULL JOIN CustomerData ON (Customer.id = CustomerData."customerID_id")
FULL JOIN CustomerData CustomerData2 ON (Customer.id = CustomerData2."customerID_id"
AND (CustomerData.customer_valid < CustomerData2.customer_valid
OR CustomerData.customer_valid = CustomerData2.customer_valid
AND CustomerData.id < CustomerData2.id)
)
WHERE CustomerData2.id IS NULL
How do I get now the current valid record (in my example the record with customer_valid 01.01.2020)?
I tried to add "AND customer_valid <= '2020-05-05'" at nearly every position within the query but never got the expected result.
If I understand you correctly you are looking for the highest "valid date" that is before "today" (or any given date). This can be achieved using a lateral join in Postgres:
SELECT c.*, cd.customer_name
FROM customer c
JOIN LATERAL (
SELECT *
FROM customerdata cd
WHERE c.id = cd.customer_id
AND cd.customer_valid <= current_date
ORDER BY cd.customer_valid DESC
LIMIT 1
) cd on true
A more efficient option would be (in my opinion) to store the start and the end of the valid period in a daterange column:
create table customer_data
(
customer_id int not null references customer,
valid_during daterange not null,
customer_name text
);
Overlapping ranges can be prevented using an exclusion constraint.
And the example ranges from your question would be stored as
[2019-01-01,2020-01-01)
[2020-01-01,2021-01-01)
[2021-01-01,infinity)
The ) denotes that the right edge is excluded.
The query then becomes as simple as:
SELECT c.*, cd.customer_name
FROM customer c
JOIN customer_data cd
on c.id = cd.customer_id
AND cd.valid_during #> current_date;
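The exclusion constraint mentioned above could look like the following sketch (the constraint name is made up; the btree_gist extension is needed so that plain equality on customer_id can be combined with range overlap in one GiST constraint):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Reject any two rows for the same customer whose validity ranges overlap (&&):
ALTER TABLE customer_data
  ADD CONSTRAINT customer_data_no_overlap
  EXCLUDE USING gist (customer_id WITH =, valid_during WITH &&);
```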

How to check if time period is fully covered by smaller periods in SQL?

Having tables like the following:
create table originalPeriods (
[Id] INT PRIMARY KEY,
[Start] DATETIME NOT NULL,
[End] DATETIME NOT NULL,
[Flag1] INT NOT NULL,
[Flag2] INT NOT NULL,
CONSTRAINT UC_UniqueFlags UNIQUE (Flag1,Flag2)
)
go
create table convertedPeriods(
[Id] INT PRIMARY KEY,
[Start] DATETIME NOT NULL,
[End] DATETIME NOT NULL,
[Flag1] INT NOT NULL,
[Flag2] INT NOT NULL
)
go
I want to check whether every period from the first table is represented by a set of periods from the second table with matching Flags.
I want converted periods (from the second table) to fill the whole original period (from the first table) with no empty spaces, no overlapping, and no extensions! Converted periods should fit the original period exactly.
The perfect outcome would be a list of original periods Id with the flag of whether it is well covered by converted periods.
Try this, let me know if it works:
select
op.Id
,[Flag: Converted Period matches Original Period] = case when cp.Id is not null then 'Found' else 'Not Found' end
from originalPeriods as op
left join convertedPeriods as cp on cp.[Start] = op.[Start] and cp.[End] = op.[End]
I guess you are looking for something like this:
with RECURSIVE periodsNet as (
select o.id, o.dtstart, o.dtend,
c.dtend as netPoint,
exists(select * from convertedPeriods c2
where (c2.dtstart > c.dtstart and c2.dtstart < c.dtend)
or (c2.dtend > c.dtstart and c2.dtend < c.dtend)
) as hasOverlap
from originalPeriods o
inner join convertedPeriods c
on o.dtstart = c.dtstart
union all
select o.id, o.dtstart as ostart, o.dtend as oend,
c.dtend as netPoint,
exists(select * from convertedPeriods c2
where (c2.dtstart > c.dtstart and c2.dtstart < c.dtend)
or (c2.dtend > c.dtstart and c2.dtend < c.dtend)
) as hasOverlap
from periodsNet o
inner join convertedPeriods c
on o.netPoint = c.dtstart
),
periodsFilled as (
select id, dtstart, dtend,
case when dtend = max(netPoint) then true else false end filled
from periodsNet
group by id, dtstart, dtend
)
select *,
exists(select * from periodsNet n where n.id = p.id and n.hasOverlap) as hasOverlap
from periodsFilled p
See the fiddle: https://www.db-fiddle.com/f/jpezmztvj7uFg1PvixaBsh/0
Thank you for your answers, but I'm afraid their effectiveness was not sufficient.
For anyone having a similar problem in the future - I ended up applying two checks:
The first check verifies that, for each original period:
1. there is a period starting at the same time,
2. there is a period ending at the same time,
3. each converted period (apart from the ending one, as in 2.) has a following period.
SELECT
op.Id
,(SELECT COUNT(*) FROM convertedPeriods ps WHERE op.Start=ps.Start AND op.Flag1=ps.Flag1 AND op.Flag2=ps.Flag2)
,(SELECT COUNT(*) FROM convertedPeriods ps WHERE op.End=ps.End AND op.Flag1=ps.Flag1 AND op.Flag2=ps.Flag2)
,(SELECT COUNT(cp.Id) FROM convertedPeriods cp WHERE NOT EXISTS(SELECT 1 FROM convertedPeriods cp2 WHERE cp2.Start=cp.End) AND cp.End <> op.End)
FROM
originalPeriods op
While there can be false positives with this method, there are no false negatives - meaning every correct period representation must pass this test.
The second check was simply to generate a random set of timestamps and compare whether they are covered by the original periods in the same way as by the converted ones.
Those methods have proven themselves a successful check of period coverage against huge amounts of data.
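The sampling check above could be sketched like this for a single probe timestamp (T-SQL, matching the question's table definitions; the variable @ts stands in for one randomly generated probe, and the two result flags must agree for every probe):

```sql
DECLARE @ts DATETIME = '2020-05-05';  -- one randomly generated probe timestamp

-- A period set "covers" the probe when some row contains it (half-open interval).
SELECT
  CASE WHEN EXISTS (SELECT 1 FROM originalPeriods o
                    WHERE @ts >= o.[Start] AND @ts < o.[End])
       THEN 1 ELSE 0 END AS inOriginal,
  CASE WHEN EXISTS (SELECT 1 FROM convertedPeriods c
                    WHERE @ts >= c.[Start] AND @ts < c.[End])
       THEN 1 ELSE 0 END AS inConverted;
```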
The best solution for checking whether the periods are covered is this one from Ask Tom:
select * from (
select nmi,
max(invoice_end_date) over(
partition by nmi order by invoice_start_date
) + 1 start_gap,
lead(invoice_start_date) over(
partition by nmi order by invoice_start_date
) - 1 end_gap
from icr_tmp
)
where start_gap <= end_gap;
It works like a charm.
(Source: https://asktom.oracle.com/pls/apex/asktom.search?tag=sql-to-find-gaps-in-date-ranges)

ORA-00904 "invalid identifier" but identifier exists in query

I'm working in a fault-reporting Oracle database, trying to get fault information out of it.
The main table I'm querying is Incident, which includes incident information. Each record in Incident may have any number of records in the WorkOrder table (or none) and each record in WorkOrder may have any number of records in the WorkLog table (or none).
What I am trying to do at this point is, for each record in Incident, find the WorkLog with the minimum value in the field MXRONSITE, and, for that worklog, return the MXRONSITE time and the REPORTDATE from the work order. I accomplished this using a MIN subquery, but it turned out that several worklogs could have the same MXRONSITE time, so I was pulling back more records than I wanted. I tried to create a subsubquery for it, but it now says I have an invalid identifier (ORA-00904) for WOL1.WONUM in the WHERE line, even though that identifier is in use elsewhere.
Any help is appreciated. Note that there is other stuff in the query, but the rest of the query works in isolation, and this bit doesn't work in the full query or on its own.
SELECT
WL1.MXRONSITE as "Date_First_Onsite",
WOL1.REPORTDATE as "Date_First_Onsite_Notified"
FROM Maximo.Incident
LEFT JOIN (Maximo.WorkOrder WOL1
LEFT JOIN Maximo.Worklog WL1
ON WL1.RECORDKEY = WOL1.WONUM)
ON WOL1.ORIGRECORDID = Incident.TICKETID
AND WOL1.ORIGRECORDCLASS = 'INCIDENT'
WHERE (WL1.WORKLOGID IN
(SELECT MIN(WL3.WORKLOGID)
FROM (SELECT MIN(WL3.MXRONSITE), WL3.WORKLOGID
FROM Maximo.Worklog WL3 WHERE WOL1.WONUM = WL3.RECORDKEY))
or WL1.WORKLOGID is null)
To clarify, what I want is:
For each fault in Incident,
the earliest MXRONSITE from the Worklog table (if such a value exists),
For that worklog, information from the associated record from the WorkOrder table.
This is complicated by Incident records having multiple work orders, and work orders having multiple work logs, which may have the same MXRONSITE time.
After some trials, I have found an (almost) working solution:
WITH WLONSITE as (
SELECT
MIN(WLW.MXRONSITE) as "ONSITE",
WLWOW.ORIGRECORDID as "TICKETID",
WLWOW.WONUM as "WONUM"
FROM
MAXIMO.WORKLOG WLW
INNER JOIN
MAXIMO.WORKORDER WLWOW
ON
WLW.RECORDKEY = WLWOW.WONUM
WHERE
WLWOW.ORIGRECORDCLASS = 'INCIDENT'
GROUP BY
WLWOW.ORIGRECORDID, WLWOW.WONUM
)
select
incident.ticketid,
wlonsite.onsite,
wlonsite.wonum
from
maximo.incident
LEFT JOIN WLONSITE
ON WLONSITE.TICKETID = Incident.TICKETID
WHERE
(WLONSITE.ONSITE is null or WLONSITE.ONSITE = (SELECT MIN(WLONSITE.ONSITE) FROM WLONSITE WHERE WLONSITE.TICKETID = Incident.TICKETID AND ROWNUM=1))
AND Incident.AFFECTEDDATE >= TO_DATE ('01/12/2015', 'DD/MM/YYYY')
This however is significantly slower, and also still not quite right, as it turns out a single Incident can have multiple Work Orders with the same ONSITE time (aaargh!).
As requested, here is a sample input, and what I want to get from it (apologies for the formatting). Note that while TICKETID and WONUM are primary keys, they are strings rather than integers. WORKLOGID is an integer.
Incident table:
TICKETID / Description / FieldX
1 / WORD1 / S
2 / WORD2 / P
3 / WORDX /
4 / / Q
Work order table:
WONUM / ORIGRECORDID / REPORTDATE
11 / 1 / 2015-01-01
12 / 2 / 2015-01-01
13 / 2 / 2015-02-04
14 / 3 / 2015-04-05
Worklog table:
WORKLOGID / RECORDKEY / MXRONSITE
101 / 11 / 2015-01-05
102 / 12 / 2015-01-04
103 / 12 /
104 / 12 / 2015-02-05
105 / 13 /
Output:
TICKETID / WONUM / WORKLOGID
1 / 11 / 101
2 / 12 / 102
3 / /
4 / /
(Worklog 101 linked to TICKETID 1, has non-null MXRONSITE, and is from work order 11)
(Worklogs 102-105 linked to TICKETID 2, of which 102 has lowest MXRONSITE, and is work order 12)
(Ticket 3's work order has no work logs, and ticket 4 has no work orders, so the work order and worklog fields are null)
Post Christmas attack!
I have found a solution which works:
The method I found was to use multiple WITH queries, as follows:
WITH WLMINL AS (
SELECT
RECORDKEY, MXRONSITE, MIN(WORKLOGID) AS "WORKLOG"
FROM MAXIMO.WORKLOG
WHERE WORKLOG.CLASS = 'WORKORDER'
GROUP BY RECORDKEY, MXRONSITE
),
WLMIND AS (
SELECT
RECORDKEY, MIN(MXRONSITE) AS "MXRONSITE"
FROM MAXIMO.WORKLOG
WHERE WORKLOG.CLASS = 'WORKORDER'
GROUP BY RECORDKEY
),
WLMIN AS (
SELECT
WLMIND.RECORDKEY AS "WONUM", WLMIND.MXRONSITE AS "ONSITE", WLMINL.WORKLOG AS "WORKLOGID"
FROM
WLMIND
INNER JOIN
WLMINL
ON
WLMIND.RECORDKEY = WLMINL.RECORDKEY AND WLMIND.MXRONSITE = WLMINL.MXRONSITE
)
Thus for each work order finding the first date, then for each work order and date finding the lowest worklogid, then joining the two tables. This is then repeated at a higher level to find the data by incident.
However this method does not work in a reasonable time, so while it may be suitable for smaller databases it's no good for the behemoths I'm working with.
I would do this with row_number function:
SQLFiddle
select ticketid, case when worklogid is not null then reportdate end d1, mxronsite d2
from (
select i.ticketid, wo.reportdate, wl.mxronsite, wo.wonum, wl.worklogid,
row_number() over (partition by i.ticketid
order by wl.mxronsite, wo.reportdate) rn
from incident i
left join workorder wo on wo.origrecordid = i.ticketid
and wo.origrecordclass = 'INCIDENT'
left join worklog wl on wl.recordkey = wo.wonum )
where rn = 1 order by ticketid
When you nest subqueries, you cannot access columns that belong to a table two or more levels higher; in your statement, WL1 is not accessible in the innermost subquery. (There is also a GROUP BY clause missing, btw.)
This might work (not exactly sure what output you expect, but try it):
SELECT
WL1.MXRONSITE as "Date_First_Onsite",
WOL1.REPORTDATE as "Date_First_Onsite_Notified"
FROM Maximo.Incident
LEFT JOIN (
Maximo.WorkOrder WOL1
LEFT JOIN Maximo.Worklog WL1
ON WL1.RECORDKEY = WOL1.WONUM
) ON WOL1.ORIGRECORDID = Incident.TICKETID
AND WOL1.ORIGRECORDCLASS = 'INCIDENT'
WHERE WL1.WORKLOGID =
( SELECT MIN(WL3.WORKLOGID)
FROM Maximo.WorkOrder WOL3
LEFT JOIN Maximo.Worklog WL3
ON WL3.RECORDKEY = WOL3.WONUM
WHERE WOL3.ORIGRECORDID = WOL1.ORIGRECORDID
AND WL3.MXRONSITE IS NOT NULL
)
OR WL1.WORKLOGID IS NULL AND NOT EXISTS
( SELECT MIN(WL4.WORKLOGID)
FROM Maximo.WorkOrder WOL4
LEFT JOIN Maximo.Worklog WL4
ON WL4.RECORDKEY = WOL4.WONUM
WHERE WOL4.ORIGRECORDID = WOL1.ORIGRECORDID
AND WL4.MXRONSITE IS NOT NULL )
I may not have the details right on what you're trying to do... if you have some sample input and desired output, that would be a big help.
That said, I think an analytic function would help a lot, not only in getting the output but in organizing the code. Here is an example of how the max analytic function in a subquery could be used.
Again, the details on the join may be off -- if you can furnish some sample input and output, I'll bet someone can get to where you're trying to go:
with wo as (
select
wonum, origrecordclass, origrecordid, reportdate,
max (reportdate) over (partition by origrecordid) as max_date
from Maximo.workorder
where origrecordclass = 'INCIDENT'
),
logs as (
select
worklogid, mxronsite, recordkey,
max (mxronsite) over (partition by recordkey) as max_mx
from Maximo.worklog
)
select
i.ticketid,
l.mxronsite as "Date_First_Onsite",
wo.reportdate as "Date_First_Onsite_Notified"
from
Maximo.incident i
left join wo on
wo.origrecordid = i.ticketid and
wo.reportdate = wo.max_date
left join logs l on
wo.wonum = l.recordkey and
l.mxronsite = l.max_mx
-- edit --
Based on your sample input and desired output, this appears to give the desired result. It does do somewhat of an explosion in the subquery, but hopefully the efficiency of the analytic functions will dampen that. They are typically much faster, compared to using group by:
with wo_logs as (
select
wo.wonum, wo.origrecordclass, wo.origrecordid, wo.reportdate,
l.worklogid, l.mxronsite, l.recordkey,
max (reportdate) over (partition by origrecordid) as max_date,
min (mxronsite) over (partition by recordkey) as min_mx
from
Maximo.workorder wo
left join Maximo.worklog l on wo.wonum = l.recordkey
where wo.origrecordclass = 'INCIDENT'
)
select
i.ticketid, wl.wonum, wl.worklogid,
wl.mxronsite as "Date_First_Onsite",
wl.reportdate as "Date_First_Onsite_Notified"
from
Maximo.incident i
left join wo_logs wl on
i.ticketid = wl.origrecordid and
wl.mxronsite = wl.min_mx
order by 1

How to do self join on min/max

I am new to sql queries.
Table is defined as
( symbol varchar,
high int,
low int,
today date,
Primary key (symbol, today)
)
I need to find for each symbol in a given date range, max(high) and min(low) and corresponding dates for max(high) and min(low).
It is okay to take the first max date and min date in the given table.
In a given date range some dates may be missing: if the start date is not present then the next available date should be used, and if the last date is not present then the earlier available date should be used.
Data is for one year and around 5000 symbols.
I tried something like this
SELECT a.symbol,
a.maxValue,
a.maxdate,
b.minValue,
b.mindate
FROM (
SELECT table1.symbol, max_a.maxValue, max_a.maxdate
FROM table1
INNER JOIN (
SELECT table1.symbol,
max(table1.high) AS maxValue,
table1.TODAY AS maxdate
FROM table1
GROUP BY table1.symbol
) AS max_a
ON max_a.symbol = table1.symbol
AND table1.today = max_a.maxdate
) AS a
INNER JOIN (
SELECT symbol,
min_b.minValue,
min_b.mindate
FROM table1
INNER JOIN (
SELECT symbol,
min(low) AS minValue,
table1.TODAY AS mindate
FROM table1
GROUP BY table1.symbol
) AS min_b
ON min_b.symbol = table1.symbol
AND table1.today = min_b.mindate
) AS b
ON a.symbol = b.symbol
The first INNER query pre-qualifies, for each symbol, what the low and high values are within the date range provided. After that, it joins back to the original table again (with the same date range criteria), but adds the qualifier that EITHER the low OR the high matches the MIN() or MAX() from the PreQuery. If so, the row is allowed into the result set.
Now, the result columns. Not knowing which version of SQL you are using, I have the first 3 columns as the "final" values. The 3 columns after that come from the record that qualified by EITHER of the qualifiers. As stocks go up and down all the time, it's possible for the high and/or low values to occur more than once within the same time period. This will include ALL the entries that meet the MIN() / MAX() criteria.
select
PreQuery.Symbol,
PreQuery.LowForSymbol,
PreQuery.HighForSymbol,
tFinal.Today as DateOfMatch,
tFinal.Low as DateMatchLow,
tFinal.High as DateMatchHigh
from
( select
t1.symbol,
min( t1.low ) as LowForSymbol,
max( t1.high ) as HighForSymbol
from
table1 t1
where
t1.today between YourFromDateParameter and YourToDateParameter
group by
t1.symbol ) PreQuery
JOIN table1 tFinal
on PreQuery.Symbol = tFinal.Symbol
AND tFinal.today between YourFromDateParameter and YourToDateParameter
AND ( tFinal.Low = LowForSymbol
OR tFinal.High = HighForSymbol )
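If your database supports window functions (the question doesn't name the RDBMS), a single-pass alternative is to rank each symbol's rows by high and by low and then pick the top-ranked dates. A sketch; :from_date and :to_date are placeholders for the date-range parameters, and ties are broken by the earliest date:

```sql
SELECT symbol,
       MAX(high) AS HighForSymbol,
       MIN(low)  AS LowForSymbol,
       -- the rn_* = 1 rows carry the dates of the extreme values
       MAX(CASE WHEN rn_high = 1 THEN today END) AS DateOfHigh,
       MAX(CASE WHEN rn_low  = 1 THEN today END) AS DateOfLow
FROM (
  SELECT t1.*,
         ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY high DESC, today) AS rn_high,
         ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY low  ASC,  today) AS rn_low
  FROM table1 t1
  WHERE today BETWEEN :from_date AND :to_date
) ranked
GROUP BY symbol;
```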

Remove duplicates (1 to many) or write a subquery that solves my problem

Referring to the diagram below, the records table has unique Records. Each record is updated, via comments, through an Update table. When I join the two I get lots of duplicates.
How do I remove the duplicates? GROUP BY does not work for me, as I have more than 10 fields in the select query and some of them are functions.
Alternatively: write a subquery which pulls the last update in the Update table for each record that was updated in a particular month. Joining with this subquery would solve my problem.
Thanks!
Edit
Table structure that is of interest is
create table Records(
recordID int,
90more_fields various
)
create table Updates(
update_id int,
record_id int,
comment text,
byUser varchar(25),
datecreate datetime
)
Here's one way.
SELECT * /*But list columns explicitly*/
FROM Orange o
CROSS APPLY (SELECT TOP 1 *
FROM Blue b
WHERE b.datecreate >= '20110901'
AND b.datecreate < '20111001'
AND o.RecordID = b.Record_ID2
ORDER BY b.datecreate DESC) b
Based on the limited information available...
WITH cteLastUpdate AS (
SELECT Record_ID2, UpdateDateTime,
ROW_NUMBER() OVER(PARTITION BY Record_ID2 ORDER BY UpdateDateTime DESC) AS RowNUM
FROM BlueTable
/* Add WHERE clause if needed to restrict date range */
)
SELECT *
FROM cteLastUpdate lu
INNER JOIN OrangeTable o
ON lu.Record_ID2 = o.RecordID
WHERE lu.RowNum = 1
Last updates per record and month:
SELECT *
FROM UPDATES outerUpd
WHERE exists
(
-- Magic part
SELECT 1
FROM UPDATES innerUpd
WHERE innerUpd.RecordId = outerUpd.RecordId
GROUP BY RecordId
, date_part('year', innerUpd.datecolumn)
, date_part('month', innerUpd.datecolumn)
HAVING max(innerUpd.datecolumn) = outerUpd.datecolumn
)
(Works on PostgreSQL, date_part is different in other RDBMS)
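Using the Updates table from the question's edit, the same "last update per record and month" idea can also be written with a window function (a sketch in Postgres syntax; date_trunc plays the role of the two date_part calls):

```sql
SELECT *
FROM (
  SELECT u.*,
         -- rank updates within each (record, calendar month), newest first
         ROW_NUMBER() OVER (PARTITION BY u.record_id,
                                         date_trunc('month', u.datecreate)
                            ORDER BY u.datecreate DESC) AS rn
  FROM Updates u
) ranked
WHERE rn = 1;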