How can I find an associated table's oldest record, filtering by one of its attributes? - sql

I have the model Subscription which has_many Versions.
A Version has a status, plan_id and authorized_at date.
Any change made to a Subscription comes from a Version modification updating its parent Subscription.
The goal is to find each Subscription's Version with the oldest authorized_at date WHERE the versions.plan_id is the same as the subscriptions.plan_id (in other words I need the authorization date of the Version where the plan_id changed to the current Subscription's plan_id).
This is the query I've come up with. I'm getting an error in the aggregate function syntax:
syntax error at or near "MIN" LINE 3: MIN (authorized_at) from versions ^
query:
select subscriptions.id,
MIN (authorized_at) from versions
where versions.plan_id = subscriptions.plan_id
) as current_version
from subscriptions
join versions on subscriptions.id = versions.subscription_id
where versions.status = 'processed'
I also am not sure if I should be grouping the versions by plan_id and then picking from each group. I'm kind of lost.

You can use a lateral subquery, which can best be described as a foreach loop in SQL. Lateral subqueries are an extremely performant way to select columns from a single correlated record, or even aggregates from a group of related records.
For each row in subscriptions the DB will select a single row from versions ordered by authorized_at:
SELECT "subscriptions".*,
"latest_version"."authorized_at" AS current_version,
"latest_version"."id" AS current_version_id -- could be very useful
FROM "subscriptions"
LEFT JOIN LATERAL
(
SELECT "versions"."authorized_at", "versions"."id"
FROM "versions"
WHERE "versions"."subscription_id" = "subscriptions"."id" -- lateral reference
AND "versions"."plan_id" = "subscriptions"."plan_id"
AND "versions"."status" = 'processed'
ORDER BY "versions"."authorized_at" ASC
LIMIT 1
) latest_version ON TRUE
Creating lateral joins in ActiveRecord can be done either with SQL strings or Arel:
class Subscription < ApplicationRecord
  # Performs a lateral join and selects the authorized_at of the
  # oldest processed version on the subscription's current plan
  def self.with_current_version
    lateral = Version.arel_table.then do |v|
      v.project(
        v[:authorized_at],
        v[:id] # optional
      ).where(
        v[:subscription_id].eq(arel_table[:id])
          .and(v[:plan_id].eq(arel_table[:plan_id]))
          .and(v[:status].eq('processed'))
      )
      .order(v[:authorized_at].asc)
      .take(1) # limit 1
      .lateral('latest_version ON TRUE')
    end
    lv = Arel::Table.new(:latest_version) # just a table alias
    select(
      *where(nil).arel.projections, # selects everything previously selected
      lv[:authorized_at].as("current_version"),
      lv[:id].as("current_version_id") # optional
    ).joins("LEFT JOIN #{lateral.to_sql}") # LEFT JOIN LATERAL (...) latest_version ON TRUE
  end
end
If you just want to select the id and current_version column you should consider using pluck instead of selecting database models that aren't properly hydrated.

You can use DISTINCT ON to filter out rows, and keep a single one per subscription -- the first one per group according to the ORDER BY clause.
For example:
select distinct on (s.id) s.id, v.authorized_at
from subscriptions s
join versions v on v.subscription_id = s.id and v.plan_id = s.plan_id
where v.status = 'processed'
order by s.id, v.authorized_at
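DISTINCT ON is PostgreSQL-specific. As a rough, portable sketch of the same idea, a plain GROUP BY with MIN returns the earliest processed authorized_at per subscription on its current plan. The example below runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed anywhere; the table layout and sample rows are assumptions made up for illustration, not the asker's real schema.

```python
import sqlite3

# In-memory database with a hypothetical, minimal version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, plan_id INTEGER);
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    subscription_id INTEGER,
    plan_id INTEGER,
    status TEXT,
    authorized_at TEXT
);
INSERT INTO subscriptions VALUES (1, 20), (2, 30);
INSERT INTO versions VALUES
    (1, 1, 10, 'processed', '2021-01-01'),  -- old plan, ignored
    (2, 1, 20, 'processed', '2021-02-01'),  -- first version on current plan
    (3, 1, 20, 'processed', '2021-03-01'),  -- later version, same plan
    (4, 2, 30, 'pending',   '2021-01-15'),  -- not processed, ignored
    (5, 2, 30, 'processed', '2021-04-01');
""")

# Earliest processed version per subscription whose plan matches the
# subscription's current plan.
rows = conn.execute("""
    SELECT s.id, MIN(v.authorized_at) AS current_version
    FROM subscriptions s
    JOIN versions v
      ON v.subscription_id = s.id
     AND v.plan_id = s.plan_id
    WHERE v.status = 'processed'
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()

print(rows)
```

Because this is an inner join, subscriptions without any matching processed version are left out of the result; a LEFT JOIN with the status condition moved into the ON clause would keep them with a NULL date.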

The code below will give you the versions where the Version's plan_id is equal to the Subscription's plan_id.
@versions = Version.joins("LEFT JOIN subscriptions ON subscriptions.plan_id = versions.plan_id")
To filter records by the Version's status:
@versions = Version.joins("LEFT JOIN subscriptions ON subscriptions.plan_id = versions.plan_id").where(status: "processed")
To filter records by the Version's status and order by authorized_at in ascending order:
@versions = Version.joins("LEFT JOIN subscriptions ON subscriptions.plan_id = versions.plan_id").where(status: "processed").order(:authorized_at)
To filter records by the Version's status and order by authorized_at in descending order:
@versions = Version.joins("LEFT JOIN subscriptions ON subscriptions.plan_id = versions.plan_id").where(status: "processed").order(authorized_at: :desc)
Hope this works for you!

Related

SQL - Returning fields based on where clause then joining same table to return max value?

I have a table named Ticket Numbers, which (for this example) contain the columns:
Ticket_Number
Assigned_Group
Assigned_Group_Sequence_No
Reported_Date
Each ticket number could contain 4 rows, depending on how many times the ticket changed assigned groups. Some of these rows could contain an assigned group of "Desktop Support," but some may not. Here is an example:
Example of raw data
What I am trying to accomplish is to get an output that contains any ticket numbers that were ever assigned to 'Desktop Support', together with the assigned group of the max sequence number. Here is what I am trying to accomplish with SQL:
Queried Data
I'm trying to use SQL with the following query but have no clue what I'm doing wrong:
select ih.incident_number,ih.assigned_group, incident_history2.maxseq, incident_history2.assigned_group
from incident_history_public as ih
left join
(
select max(assigned_group_seq_no) maxseq, incident_number, assigned_group
from incident_history_public
group by incident_number, assigned_group
) incident_history2
on ih.incident_number = incident_history2.incident_number
and ih.assigned_group_seq_no = incident_history2.maxseq
where ih.ASSIGNED_GROUP LIKE '%DS%'
Does anyone know what I am doing wrong?
You might want to create a proper alias for incident_history. e.g.
from incident_history as incident_history1
and
on incident_history1.ticket_number = incident_history2.ticket_number
and incident_history1.assigned_group_seq_no = incident_history2.maxseq
In my humble opinion a first error could be that I don't see any column named "incident_history2.assigned_group".
I would try to use common table expression, to get only ticket number that contains "Desktop_support":
WITH desktop as (
SELECT distinct Ticket_Number
FROM incident_history
WHERE Assigned_Group = 'Desktop Support'
),
Then inner join that result with a derived table that yields each ticket number and its maxseq, so that in a second step you can also get the "MAXGroup" (this continues the same WITH clause, so the WITH keyword is not repeated):
tmp AS (
SELECT i2.Ticket_Number, i2.maxseq
FROM desktop D inner join
(SELECT Ticket_number, max(assigned_group_seq_no) as maxseq
FROM incident_history
GROUP BY ticket_number) as i2
ON D.Ticket_Number = i2.Ticket_Number
)
SELECT i.Ticket_Number, i.Assigned_Group as MAX_Group, T.maxseq, i.Reported_Date
FROM tmp T inner join incident_history i
ON T.Ticket_Number = i.Ticket_Number and i.assigned_group_seq_no = T.maxseq
I think there are several different methods to resolve this question, but I really hope this is helpful for you!
For more information about Common Table Expression: https://www.essentialsql.com/introduction-common-table-expressions-ctes/
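To make the CTE approach concrete, here is a small self-contained sketch that runs against an in-memory SQLite database via Python's sqlite3 module. The table and column names follow the question, but the sample rows are invented for illustration, and the two CTEs are written as a single WITH clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incident_history (
    ticket_number TEXT,
    assigned_group TEXT,
    assigned_group_seq_no INTEGER
);
INSERT INTO incident_history VALUES
    ('T1', 'Service Desk',    1),
    ('T1', 'Desktop Support', 2),
    ('T1', 'Network',         3),  -- final group for T1
    ('T2', 'Service Desk',    1),
    ('T2', 'Network',         2),  -- T2 never touched Desktop Support
    ('T3', 'Desktop Support', 1);  -- only and final group for T3
""")

rows = conn.execute("""
    WITH desktop AS (
        -- tickets that were assigned to Desktop Support at some point
        SELECT DISTINCT ticket_number
        FROM incident_history
        WHERE assigned_group = 'Desktop Support'
    ),
    tmp AS (
        -- highest sequence number per ticket
        SELECT ticket_number, MAX(assigned_group_seq_no) AS maxseq
        FROM incident_history
        GROUP BY ticket_number
    )
    SELECT i.ticket_number, i.assigned_group AS max_group, t.maxseq
    FROM desktop d
    JOIN tmp t ON t.ticket_number = d.ticket_number
    JOIN incident_history i
      ON i.ticket_number = t.ticket_number
     AND i.assigned_group_seq_no = t.maxseq
    ORDER BY i.ticket_number
""").fetchall()

print(rows)
```

Ticket T2 is filtered out because it never had a 'Desktop Support' row, while T1 reports the group of its last sequence number rather than 'Desktop Support' itself.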

Previous Record With Cross Apply Syntax

I have a table called ArchiveActivityDetails which shows the history of a Customer Repair Order. 1 Repair Order will have many visits (ActivityID) with a Technician allocated depending on who is available for that planned visit.
The system automatically allocates the time that is required for a job, but sometimes a job requires longer, so we manually amend jobs.
My initial query from the customer was to pull the manually amended jobs (i.e. jobs where PlannedDuration >= 60 minutes) and show the Technician linked to that manually amended job.
This report works fine.
My most recent request from the customer is to now ADD a column showing WHO WAS THE PREVIOUS TECHNICIAN linked to the Repair Order.
My colleagues suggested I do a Cross Apply going back to the ArchiveActivityDetails table and then show "Previous Tech", but I have not used Cross Apply before and I am struggling with the syntax and unable to get the results I want. In my Cross Apply I used LAG to work out the 'PrevTech', but when pulling it into my main report, I get NULL. So I assume I am not doing the Cross Apply correctly.
DECLARE @DateFrom as DATE = '2019-05-20'
DECLARE @DATETO AS DATE = '2019-07-23'
----------------------------------------------------------------------------------
SELECT
AAD.Date
,ASM.ASM
,A.ASM as PrevASM
,ASM.KDGID2
,R.ResourceName
,R.ID_ResourceID
,A.ServiceOrderNumber
,CONCAT(EN.TECHVORNAME, ' ' , EN.TECHNACHNAME) as TechName
,A.PrevTech
,EN.TechnicianID
,AAD.ID_ActivityID
,SO.ServiceOrderNumber
,AAD.VisitNumber
,AAD.PlannedDuration
,AAD.ActualDuration
,AAD.PlannedDuration-AAD.ActualDuration as DIFF
,DR.Original_Duration
FROM
[Easy].[ASMTrans] AS ASM
INNER JOIN
[FS_OTBE].[EngPayrollNumbers] AS EN
ON ASM.KDGID2 = EN.KDGID2
INNER JOIN
[OFSA].[ResourceID] AS R
ON EN.TechnicianID = Try_cast(R.ResourceName as int)
INNER JOIN
[OFSDA].[ArchiveActivityDetails] as [AAD]
ON R.[ID_ResourceID] = AAD.ID_ResourceID
INNER JOIN
[OFSA].[ServiceOrderNumber] SO
ON SO.ID_ServiceOrderNumber = AAD.ID_ServiceOrderNumber
LEFT JOIN
[OFSE].[DurationRevision] DR
on DR.ID_ActivityID = AAD.ID_ActivityID
CROSS APPLY
(
SELECT
AD.Date
,AD.ID_CountryCode
,AD.ID_Status
,Activity_TypeID
,AD.ID_ActivityID
,AD.ID_ResourceID
,SO.ServiceOrderNumber
,ASM.ASM
,LAG(EN.TECHVORNAME+ ' '+EN.TECHNACHNAME) OVER (ORDER BY SO.ServiceOrderNumber,AD.ID_ActivityID) as PrevTech
,AD.VisitNumber
,AD.ID_ServiceOrderNumber
,AD.PlannedDuration
,AD.ActualDuration
,ROW_NUMBER() OVER (PARTITION BY AD.ID_ServiceOrderNumber Order by AD.ID_ActivityID,AD.Date) as ROWNUM
FROM
[Easy].[ASMTrans] AS ASM
INNER JOIN
[FS_OTBE].[EngPayrollNumbers] AS EN
ON ASM.KDGID2 = EN.KDGID2
INNER JOIN
[OFSA].[ResourceID] AS R
ON EN.TechnicianID = Try_cast(R.ResourceName as int)
INNER JOIN
[OFSDA].[ArchiveActivityDetails] as [AD]
ON R.[ID_ResourceID] = AD.ID_ResourceID
INNER JOIN
[OFSA].[ServiceOrderNumber] SO
ON SO.ID_ServiceOrderNumber = AD.ID_ServiceOrderNumber
WHERE
AAD.ID_ActivityID = AD.ID_ActivityID
AND
AD.ID_CountryCode = AAD.ID_CountryCode
AND AD.ID_Status = AAD.ID_Status
AND AD.ID_ResourceID = AAD.ID_ResourceID
AND AD.Activity_TypeID = AAD.Activity_TypeID
AND AD.ID_ServiceOrderNumber = AAD.ID_ServiceOrderNumber
AND AD.Date >= '2019-05-01'
) as A
WHERE
ASM.KDGID2
IN (50008323,50008326,50008329,50008332,50008335,50008338,50008341,50008344,50008347,50008350,50008353,50008356,50008359,50008362,50008365)
AND AAD.ID_Status = 1
AND AAD.ID_CountryCode = 7
AND AAD.Activity_TypeID=91
AND
(
AAD.[Date] BETWEEN IIF(@DateFrom < '20190520','20190520',@DateFrom) AND IIF(@DateTo < '20190520','20190520',@DateTo))
AND AAD.ActualDuration > 11
AND
(
(DR.Original_Duration >= 60)
OR
(DR.ID_ActivityID IS NULL AND AAD.PlannedDuration >= 60))
I expect to see the previous Tech and previous Area Sales Manager for the job that was manually amended.
Business Reason: Managers want to see who initially requested that the job be manually amended. The time requested is being overestimated, which wastes time. To plan better, they need to see who requests extra time at a job and try to reduce the time.
I will attach the ArchiveActivityDetail table showing the history of a Repair Order as well as expected results.
Your query results in the cross apply will appear as a table in your query, so you can use top(1) and order by descending to get the first row ordered by what you want (it looks like ActivityId? maybe VisitNumber?).
Simplifying to get at the root of the issue, say you have just one table with ServiceOrderNumber, ID_Activity, ASM, and TECH. To get the previous row for activity 2414073 you would do this:
select top(1) ASM, TECH
from OFSDA.ArchiveActivityDetails as AD
where ID_ServiceOrderNumber = 2370634229 -- same ServiceOrderNumber
and ID_Activity < 2414073 -- previous activities
order by ID_Activity desc -- highest activity less than 2414073
Instead of cross apply, you probably want to use outer apply. This is the same but you will get a row in your main query for the first activity, it will just have nulls for values in your apply. If you want the first row omitted from your results because it doesn't have a previous row, go ahead and use cross apply.
You can just put the above query into the parenthesis in outer apply() and add an alias (Previous). You link to the values for the current row in your main query, use top(1) to get the first row only, and order by ID_Activity descending to get the row with the highest ID_Activity.
select ASM, TECH,
PreviousASM, PreviousTECH
from OFSDA.ArchiveActivityDetails as AD
outer apply (
select top(1) ADInner.ASM as PreviousASM, ADInner.TECH as PreviousTECH
from OFSDA.ArchiveActivityDetails as ADInner
where ADInner.ID_ServiceOrderNumber = AD.ID_ServiceOrderNumber
and ADInner.ID_Activity < AD.ID_Activity
order by ADInner.ID_Activity desc
) Previous
where ID_ServiceOrderNumber = 2370634229
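OUTER APPLY is SQL Server syntax, but the "previous row" logic it wraps (same service order, smaller activity id, take the highest) can be tried out anywhere as a correlated subquery. The sketch below does that in an in-memory SQLite database through Python's sqlite3 module, with LIMIT 1 standing in for top(1). The single simplified table and its rows are hypothetical, in the same spirit as the simplification above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (
    id_activity INTEGER PRIMARY KEY,
    id_service_order INTEGER,
    tech TEXT
);
INSERT INTO activities VALUES
    (100, 1, 'Anna'),
    (101, 1, 'Ben'),
    (102, 1, 'Cara'),
    (200, 2, 'Dev');
""")

# For each activity, look up the technician of the closest earlier
# activity on the same service order (NULL when there is none).
rows = conn.execute("""
    SELECT a.id_activity,
           a.tech,
           (SELECT p.tech
            FROM activities p
            WHERE p.id_service_order = a.id_service_order
              AND p.id_activity < a.id_activity
            ORDER BY p.id_activity DESC
            LIMIT 1) AS previous_tech
    FROM activities a
    ORDER BY a.id_activity
""").fetchall()

print(rows)
```

The first activity of each service order comes back with a NULL previous technician, which matches the OUTER APPLY behaviour described above; CROSS APPLY would drop those rows instead.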

Select last unique polymorphic objects ordered by created at in Rails

I'm trying to get unique polymorphic objects by the value of one of the columns. I'm using Postgres.
The object has the following properties: id, available_type, available_id, value, created_at, updated_at.
I'm looking to get the most recent object per available_id (recency determined by created_at) for records with the available_type of "User".
I've been trying ActiveRecord queries like this:
Service.where(available_type: "User").order(created_at: :desc).distinct(:available_id)
But it isn't limiting to one per available_id.
Try
Service.where(id: Service
.where(available_type: "User")
.group(:available_id)
.maximum(:id).values)
Using a left join is probably going to be your most efficient way.
The following sql selects only rows where there are no rows with a larger created_at.
See this post for more info: https://stackoverflow.com/a/27802817/5301717
query = <<-SQL
SELECT m.* -- get the row that contains the max value
FROM services m -- "m" from "max"
LEFT JOIN services b -- "b" from "bigger"
ON m.available_id = b.available_id -- match "max" row with "bigger" row by available_id
AND m.available_type = b.available_type
AND m.created_at < b.created_at -- want "bigger" than "max"
WHERE b.created_at IS NULL -- keep only if there is no bigger than max
AND m.available_type = 'User'
SQL
Service.find_by_sql(query)
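The anti-join pattern is easy to check on a toy dataset. The sketch below runs it against an in-memory SQLite database using Python's sqlite3 module; the column names follow the question, while the rows are invented, including one non-User row that shares an available_id to show why the type has to be part of the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE services (
    id INTEGER PRIMARY KEY,
    available_type TEXT,
    available_id INTEGER,
    value TEXT,
    created_at TEXT
);
INSERT INTO services VALUES
    (1, 'User', 1, 'a', '2020-01-01'),
    (2, 'User', 1, 'b', '2020-02-01'),  -- newest for User 1
    (3, 'User', 2, 'c', '2020-01-15'),  -- only row for User 2
    (4, 'Team', 1, 'd', '2020-03-01');  -- different type, must not interfere
""")

# Keep a row only when no newer row exists for the same polymorphic owner.
rows = conn.execute("""
    SELECT m.id, m.available_id, m.value
    FROM services m
    LEFT JOIN services b
      ON m.available_id = b.available_id
     AND m.available_type = b.available_type
     AND m.created_at < b.created_at
    WHERE b.id IS NULL
      AND m.available_type = 'User'
    ORDER BY m.available_id
""").fetchall()

print(rows)
```

If two rows for the same owner share the exact same created_at, both survive the anti-join, so a tie-breaker on id would be needed when timestamps are not unique.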
distinct doesn't take a column name as an argument, only true/false.
distinct is for returning only distinct records and has nothing to do with filtering for a specific value.
if you need a specific available_id, you need to use where
e.g.
Service.distinct.where(available_type: "User").where(available_id: YOUR_ID_HERE).order(created_at: :desc)
to only get the most recent add limit
Service.distinct.where(available_type: "User").where(available_id: YOUR_ID_HERE).order(created_at: :desc).limit(1)
if you need to get the most recent of each distinct available_id, that will require a loop
first get the distinct polymorphic values by only selecting the columns that need to be distinct with select:
available_ids = Service.distinct.select(:available_id).where(available_type: 'User')
then get the most recent of each id:
recents = []
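available_ids.each do |service|
  # each element is a Service with only available_id loaded; .first returns the record itself
  recents << Service.where(available_id: service.available_id).where(available_type: 'User').order(created_at: :desc).first
end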

Getting count of latest items from secondary view

I've got a problem constructing a somewhat advanced query.
I have two views - A and B where B is the child of A.
This relationship is handled by
A vw_StartDate.MapToID
=
B vw_TrackerFeaturesBasic.StartDateMapToID.
What I need to do is grab every parent A and a count of the LATEST added children B.
This is a query that gets the latest children B in a SSRS-report: (This does not use A at all!):
/****** Selecting the incomplete, applicable issues of the latest insert. ******/
SELECT DISTINCT [TRK_Feature_LKID]
,[TrackerGroupDescription]
,[ApplicableToProject]
,[ReadyForWork]
,[DateStamp]
FROM [vw_TRK_TrackerFeaturesBasic] as temp
WHERE ApplicableToProject = 1
AND DateStamp = (SELECT MAX(DateStamp) from [vw_TRK_TrackerFeaturesBasic] WHERE [TRK_StartDateID] = @WSCTrackerID AND StartDateMapToID = @HierarchyID AND [TRK_Feature_LKID] = temp.TRK_Feature_LKID )
ORDER BY DateStamp DESC
I've tried a few different ways, but I can't figure out how to get the latest added children from the subquery (I've mainly used a subquery nestled in a COUNT / Case + SUM).
Since SQL Server doesn't really allow us to use aggregate functions in aggregate functions I'm not sure how to get the latest added item in a subquery as the subquery most likely has to be nested in a COUNT or something similar.
Below is a version I'm working on (doesn't work):
Column 'vw_TRK_TrackerFeaturesBasic.StartDateMapToID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
SELECT b.TRK_StartDateID
,(SELECT COUNT(b.TRK_Feature_LKID) FROM b )
FROM vw_TRK_StartDate as a
LEFT JOIN vw_TRK_TrackerFeaturesBasic as b
ON b.StartDateMapToID = a.MapToID AND b.DateStamp = (SELECT MAX(DateStamp) FROM [vw_TRK_TrackerFeaturesBasic] WHERE [TRK_StartDateID] = 47 AND [StartDateMapToID] = 13)
WHERE MapToId = 13
--(SELECT MAX(DateStamp) from [vw_TRK_TrackerFeaturesBasic] WHERE [TRK_StartDateID] = @WSCTrackerID AND StartDateMapToID = @HierarchyID AND [TRK_Feature_LKID] = temp.TRK_Feature_LKID
GROUP BY b.TRK_StartDateID
Your question is a bit hard to follow, because I don't see a relationship between your queries and this request:
What I need to do is grab every parent A and a count of the LATEST
added children B.
Focusing on this statement, you can do this readily with window functions:
SELECT tfb.StartDateMapToID, COUNT(*)
FROM (SELECT tfb.*,
MAX(tfb.DateStamp) OVER (PARTITION BY tfb.StartDateMapToID) as max_DateStamp
FROM vw_TRK_TrackerFeaturesBasic tfb
) tfb
WHERE tfb.DateStamp = tfb.max_DateStamp
GROUP BY tfb.StartDateMapToID;
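As a hedged illustration of the window-function approach, the following sketch computes the per-parent maximum DateStamp with MAX() OVER and then counts only the children that carry it. It uses an in-memory SQLite database through Python's sqlite3 module (window functions need SQLite 3.25 or newer), and the simplified features table and its rows are assumptions standing in for the real view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE features (
    start_date_map_to_id INTEGER,  -- parent key
    feature_id INTEGER,
    date_stamp TEXT
);
INSERT INTO features VALUES
    (13, 1, '2020-01-01'),  -- older insert, not counted
    (13, 2, '2020-02-01'),  -- latest insert for parent 13
    (13, 3, '2020-02-01'),  -- latest insert for parent 13
    (14, 4, '2020-01-10');  -- latest (and only) insert for parent 14
""")

rows = conn.execute("""
    SELECT t.start_date_map_to_id, COUNT(*) AS latest_children
    FROM (
        SELECT f.*,
               MAX(f.date_stamp) OVER (
                   PARTITION BY f.start_date_map_to_id
               ) AS max_date_stamp
        FROM features f
    ) t
    WHERE t.date_stamp = t.max_date_stamp  -- keep only the latest children
    GROUP BY t.start_date_map_to_id
    ORDER BY t.start_date_map_to_id
""").fetchall()

print(rows)
```

Parents without any children do not appear in this result; getting "every parent" would mean left-joining this result to the parent view and treating a missing count as zero.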

Most efficient way to perform an SQL-Statement containing a max-aggregate in the sub-query

I have a currency table and an exchange_rate_log table. The latter contains over a billion of records. The exchange_rate_log table contains the traded exchange rates for many currencies for the last couple of years.
Now I have to select for all available currencies (in the table currency) the latest valid traded exchange rate for a given exchange_currency and a given date.
So if the given exchange_currency would be "EUR" and the date would be yesterday. The result would return the latest trades of all available currencies into "EUR" in the time window from the first available entries in the table "exchange_rate_log" until yesterday.
The following query shows a possible way to get the answer. However the given query does not perform very well.
SELECT cur.name, log.price, log.valid_at
FROM currency cur
JOIN exchange_rate_log log ON (cur.id = log.currency_id)
WHERE log.valid_at = (SELECT max(log2.valid_at)
FROM exchange_rate_log log2
WHERE log2.currency_id = cur.id
AND log2.exchange_currency = ?
AND log2.valid_at < ?);
Is there a possibility to get the same result with an adapted query which would perform better? Is it possible to create an index to boost the performance of the above query?
Remark: The target dbms is Oracle.
You didn't tag which DBMS you're using, so this is using RANK from Standard SQL:
select *
from
(
SELECT cur.name, log.price, log.valid_at,
RANK()
OVER (PARTITION BY cur.id
order by valid_at DESC) as rnk
FROM currency cur
JOIN exchange_rate_log log on (cur.id = log.currency_id)
WHERE log.exchange_currency = ?
AND log.valid_at < ?
) dt
where rnk = 1;
Whether this is more efficient depends on the optimizing capabilities of the DBMS.
Otherwise adding a log.valid_at < ? condition to your original query might help.
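To show what the RANK-based query returns, here is a small runnable sketch against an in-memory SQLite database via Python's sqlite3 module (window functions need SQLite 3.25 or newer). The rows, the bound parameter values and the composite index are assumptions for illustration only; whether such an index helps on Oracle with a billion rows would have to be verified with the actual execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE currency (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE exchange_rate_log (
    currency_id INTEGER,
    exchange_currency TEXT,
    price REAL,
    valid_at TEXT
);
-- Hypothetical index covering the filter and the per-currency ordering.
CREATE INDEX idx_log_lookup
    ON exchange_rate_log (exchange_currency, currency_id, valid_at);
INSERT INTO currency VALUES (1, 'USD'), (2, 'GBP');
INSERT INTO exchange_rate_log VALUES
    (1, 'EUR', 0.90, '2020-01-01'),
    (1, 'EUR', 0.92, '2020-01-05'),  -- latest USD rate before the cutoff
    (1, 'EUR', 0.95, '2020-01-10'),  -- after the cutoff, ignored
    (2, 'EUR', 1.15, '2020-01-03'),  -- latest GBP rate before the cutoff
    (2, 'CHF', 1.20, '2020-01-04');  -- other exchange currency, ignored
""")

# Rank each currency's rates from newest to oldest and keep rank 1.
rows = conn.execute("""
    SELECT name, price, valid_at
    FROM (
        SELECT cur.name, log.price, log.valid_at,
               RANK() OVER (
                   PARTITION BY cur.id
                   ORDER BY log.valid_at DESC
               ) AS rnk
        FROM currency cur
        JOIN exchange_rate_log log ON cur.id = log.currency_id
        WHERE log.exchange_currency = ?
          AND log.valid_at < ?
    ) dt
    WHERE rnk = 1
    ORDER BY name
""", ("EUR", "2020-01-08")).fetchall()

print(rows)
```

RANK keeps ties, so two trades with an identical valid_at for the same currency would both be returned; ROW_NUMBER would pick exactly one of them.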
SELECT TOP 1
cur.name, log.price, log.valid_at
FROM exchange_rate_log log
INNER JOIN currency cur
ON log.currency_id = cur.id
WHERE log.exchange_currency = ?
AND log.valid_at < ?
ORDER BY log.valid_at DESC;
It is also very important to have an index on log.valid_at and log.exchange_currency, or else nothing is going to make your query fast.
I also think that the performance of the query presented will be similar but I think that it is slightly simplified.