Select query with max date - sql

I have this query
SQL query: selecting by branch and machine code, order by branch and date
SELECT
mb.machine_id AS 'MachineId',
MAX(mb.date) AS 'Date',
mi.branch_id AS 'BranchId',
b.branch AS 'Branch',
b.branch_code AS 'BranchCode'
FROM
dbo.machine_beat mb
LEFT JOIN dbo.machine_ids mi
ON mb.machine_id = mi.machine_id
LEFT JOIN dbo.branches b
ON mi.branch_id = b.lookup_key
GROUP BY
mb.machine_id,
mi.branch_id,
b.branch,
b.branch_code
ORDER BY
b.branch, [Date] DESC
Query result:
|==========|=======================|=========|==========|==========|
|MachineId |Date |BranchId |Branch |BranchCode|
|==========|=======================|=========|==========|==========|
|SS10000005|2014-03-31 19:10:17.110|3 |Mamamama |MMMM |
|SS10000043|2014-03-31 17:16:32.760|3 |Mamamama |MMMM |
|SS10000005|2014-02-17 14:58:42.523|3 |Mamamama |MMMM |
|==================================================================|
My problem is how to select the updated machine code? Expected query result:
|==========|=======================|=========|==========|==========|
|MachineId |Date |BranchId |Branch |BranchCode|
|==========|=======================|=========|==========|==========|
|SS10000005|2014-03-31 19:10:17.110|3 |Mamamama |MMMM |
|==================================================================|
Update
I created sqlfiddle. I also added data, aside from MMMM. I need the updated date for each branch. So probably, my result will be:
|==========|=======================|=========|==========|==========|
|MachineId |Date |BranchId |Branch |BranchCode|
|==========|=======================|=========|==========|==========|
|SS10000343|2014-06-03 13:43:40.570|1 |Cacacaca |CCCC |
|SS30000033|2014-03-31 18:59:42.153|8 |Fafafafa |FFFF |
|SS10000005|2014-03-31 19:10:17.110|3 |Mamamama |MMMM |
|==================================================================|

Try using Row_number with partition by
select * from
(
SELECT
mb.machine_id AS 'MachineId',
mb.date AS 'Date',
mi.branch_id AS 'BranchId',
b.branch AS 'Branch',
b.branch_code AS 'BranchCode',rn=row_number()over(partition by mb.machine_id order by mb.date desc)
FROM
dbo.machine_beat mb
LEFT JOIN dbo.machine_ids mi
ON mb.machine_id = mi.machine_id
LEFT JOIN dbo.branches b
ON mi.branch_id = b.lookup_key
WHERE
branch_code = 'MMMM'
/*
GROUP BY
mb.machine_id,
mi.branch_id,
b.branch,
b.branch_code
*/
)x
where x.rn=1

#861051069712110711711710997114 is looking in the right direction - this is a greatest-n-per-group question. Yours is more complicated than the usual because the greatest portion is coming from a different table than the group portion. The only issue with his answer is that you hadn't provided sufficient information to finish it correctly.
The following solves the problem:
WITH Most_Recent_Beat AS (SELECT Machine.branch_id,
Beat.machine_id, Beat.date,
ROW_NUMBER() OVER(PARTITION BY Machine.branch_id
ORDER BY Beat.date DESC) AS rn
FROM machine_id Machine
JOIN machine_beat Beat
ON Beat.machine_id = Machine.machine_id)
SELECT Beat.machine_id, Beat.date,
Branches.lookup_key, Branches.branch, Branches.branch_code
FROM Branches
JOIN Most_Recent_Beat Beat
ON Beat.branch_id = Branches.lookup_key
AND Beat.rn = 1
ORDER BY Branches.branch, Beat.date DESC
(and corrected SQL Fiddle for testing. You shouldn't be using a different RDBMS for the example, especially as there were syntax errors for the db you say you're using.)
Which yields your expected results.
So what's going on here? The key is the ROW_NUMBER()-function line. This function itself simply generates a number series. The OVER(...) clause defines what's known as a window, over which the function will be run. PARTITION BY is akin to GROUP BY - every time a new group occurs (new Machine.branch_id value), the function restarts. The ORDER BY inside the parenthesis simply says that, per group, entries should have the given function run on entries in that order. So, the greatest date (most recent, assuming all dates are in the past) gets 1, the next 2, etc.
This is done in a CTE here (it could also be done as part of a subquery table-reference) because only the most recent date is required - where the generated row number is 1; as SQL Server doesn't allow you to put SELECT-clause aliases into the WHERE clause, it needs to be wrapped in another level to be able to reference it that way.

Related

How to run sql queries with multiple with clauses(sub-query refactoring)?

I have a code block that has 7/8 with clauses(
sub-query refactoring) in queries. I'm looking on how to run this query as I'm getting 'sql compilation errors' when running these!, While I'm trying to run them I'm getting errors in snowflake. for eg:
with valid_Cars_Stock as (
select car_id
from vw_standard.agile_car_issue_dime
where car_stock_expiration_ts is null
and car_type_name in ('hatchback')
and car_id = 1102423975
)
, car_sale_hist as (
select vw.issue_id, vw.delivery_effective_ts, bm.car_id,
lag(bm.sprint_id) over (partition by vw.issue_id order by vw.delivery_effective_ts) as previous_stock_id
from valid_Cars_Stock i
join vw_standard.agile_car_fact vw on vw.car_id = bm.car_id
left join vw_standard.agile_board_stock_bridge b on b.board_stock_bridge_dim_key = vw.issue_board_sprint_bridge_dim_key
order by vw.car_stock_expiration_ts desc
)
,
So how to run this 2 queries separately or together! I'm new to sql aswell any help would be ideal
So lets just reformate that code as it stands:
with valid_Cars_Stock as (
select
car_id
from vw_standard.agile_car_issue_dime
where car_stock_expiration_ts is null
and car_type_name in ('hatchback')
and car_id = 1102423975
), car_sale_hist as (
select
vw.issue_id,
vw.delivery_effective_ts,
bm.car_id,
lag(bm.sprint_id) over (partition by vw.issue_id order by vw.delivery_effective_ts) as previous_stock_id
from valid_Cars_Stock i
join vw_standard.agile_car_fact vw
on vw.car_id = bm.car_id
left join vw_standard.agile_board_stock_bridge b
on b.board_stock_bridge_dim_key = vw.issue_board_sprint_bridge_dim_key
order by vw.car_stock_expiration_ts desc
),
There are clearly part of a larger block of code.
For an aside of CTE, you should 100% ignore anything anyone (including me) says about them. They are 2 things, a statical sugar, and they allow avoidance of repetition, thus the Common Table Expression name. Anyways, they CAN perform better than temp tables, AND they CAN perform worse than just repeating the say SQL many times in the same block. There is no one rule. Testing is the only way to find for you SQL what is "fastest" and it can and does change as updates/releases are made. So ignoring performance comments.
if I am trying to run a chain like this to debug it I alter the point I would like to stop normally like so:
with valid_Cars_Stock as (
select
car_id
from vw_standard.agile_car_issue_dime
where car_stock_expiration_ts is null
and car_type_name in ('hatchback')
and car_id = 1102423975
)--, car_sale_hist as (
select
vw.issue_id,
vw.delivery_effective_ts,
bm.car_id,
lag(bm.sprint_id) over (partition by vw.issue_id order by vw.delivery_effective_ts) as previous_stock_id
from valid_Cars_Stock i
join vw_standard.agile_car_fact vw
on vw.car_id = bm.car_id
left join vw_standard.agile_board_stock_bridge b
on b.board_stock_bridge_dim_key = vw.issue_board_sprint_bridge_dim_key
order by vw.car_stock_expiration_ts desc
;), NEXT_AWESOME_CTE_THAT_TOTALLY_MAKES_SENSE (
-- .....
and now the result of car_sale_hist will be returned. because we "completed" the CTE chain by not "starting another" and the ; stopped the this is all part of my SQL block.
Then once you have that steps working nicely, remove the semi-colon and end of line comments, and get of with value real value.

Calculated column syntax when using a group by function Teradata

I'm trying to include a column calculated as a % of OTYPE.
IE
Order type | Status | volume of orders at each status | % of all orders at this status
SELECT
T.OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
(STATVOL / COUNT(ROW_ID)) * 100
FROM Database.S_ORDER O
LEFT JOIN /* Finding definitions for status codes & attaching */
(
SELECT
ROW_ID AS TYPEJOIN,
"NAME" AS OTYPE
FROM database.S_ORDER_TYPE
) T
ON T.TYPEJOIN = ORDER_TYPE_ID
GROUP BY (T.OTYPE, STATUS_CD)
/*Excludes pending and pending online orders */
WHERE CAST(CREATED AS DATE) = '2018/09/21' AND STATUS_CD <> 'Pending'
AND STATUS_CD <> 'Pending-Online'
ORDER BY T.OTYPE, STATUS_CD DESC
OTYPE STATUS_CD STATVOL TOTALPERC
Add New Service Provisioning 2,740 100
Add New Service In-transit 13 100
Add New Service Error - Provisioning 568 100
Add New Service Error - Integration 1 100
Add New Service Complete 14,387 100
Current output just puts 100 at every line, need it to be a % of total orders
Could anyone help out a Teradata & SQL student?
The complication making this difficult is my understanding of the group by and count syntax is tenuous. It took some fiddling to get it displayed as I have it, I'm not sure how to introduce a calculated column within this combo.
Thanks in advance
There are a couple of places the total could be done, but this is the way I would do it. I also cleaned up your other sub query which was not required, and changed the date to a non-ambiguous format (change it back if it cases an issue in Teradata)
SELECT
T."NAME" as OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
COUNT(STATUS_CD)*100/TotalVol as Pct
FROM database.S_ORDER O
LEFT JOIN EDWPRDR_VW40_SBLCPY.S_ORDER_TYPE T on T.ROW_ID = ORDER_TYPE_ID
cross join (select count(*) as TotalVol from database.S_ORDER) Tot
GROUP BY T."NAME", STATUS_CD, TotalVol
WHERE CAST(CREATED AS DATE) = '2018-09-21' AND STATUS_CD <> 'Pending' AND STATUS_CD <> 'Pending-Online'
ORDER BY T."NAME", STATUS_CD DESC
A where clause comes before a group by clause, so the query
shown in the question isn't valid.
Always prefix every column reference with the relevant table alias, below I have assumed that where you did not use the alias that it belongs to the orders table.
You probably do not need a subquery for this left join. While there are times when a subquery is needed or good for performance, this does not appear to be the case here.
Most modern SQL compliant databases provide "window functions", and Teradata does do this. They are extremely useful, and here when you combine count() with an over clause you can get the total of all rows without needing another subquery or join.
Because there is neither sample data nor expected result provided with the question I do not actually know which numbers you really need for your percentage calculation. Instead I have opted to show you different ways to count so that you can choose the right ones. I suspect you are getting 100 for each row because the count(status_cd) is equal to the count(row_id). You need to count status_cd differently to how you count row_id. nb: The count() function increases by 1 for every non-null value
I changed the way your date filter is applied. It is not efficient to change data on every row to suit constants in a where clause. Leave the data untouched and alter the way you apply the filter to suit the data, this is almost always more efficient (search sargable)
SELECT
t.OTYPE
, o.STATUS_CD
, COUNT(o.STATUS_CD) count_status
, COUNT(t.ROW_ID count_row_id
, count(t.row_id) over() count_row_id_over
FROM dbo.S_ORDER o
LEFT JOIN dbo.S_ORDER_TYPE t ON t.TYPEJOIN = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
WHERE o.CREATED >= '2018-09-21' AND o.CREATED < '2018-09-22'
AND o.STATUS_CD <> 'Pending'
AND o.STATUS_CD <> 'Pending-Online'
GROUP BY
t.OTYPE
, o.STATUS_CD
ORDER BY
t.OTYPE
, o.STATUS_CD DESC
As #TomC already noted, there's no need for the join to a Derived Table. The simplest way to get the percentage is based on a Group Sum. I also changed the date to an Standard SQL Date Literal and moved the where before group by.
SELECT
t."NAME",
o.STATUS_CD,
Count(o.STATUS_CD) AS STATVOL,
-- rule of thumb: multiply first then divide, otherwise you will get unexpected results
-- (Teradata rounds after each calculation)
100.00 * STATVOL / Sum(STATVOL) Over ()
FROM database.S_ORDER AS O
/* Finding definitions for status codes & attaching */
LEFT JOIN database.S_ORDER_TYPE AS t
ON t.ROW_ID = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
-- if o.CREATED is a Timestamp there's no need to apply the CAST
WHERE Cast(o.CREATED AS DATE) = DATE '2018-09-21'
AND o.STATUS_CD NOT IN ('Pending', 'Pending-Online')
GROUP BY (T.OTYPE, o.STATUS_CD)
ORDER BY T.OTYPE, o.STATUS_CD DESC
Btw, you probably don't need an Outer Join, Inner should return the same result.

How to Avoid Duplicate ID's In Access SQL

I have a problem that I hope you can help me.
I have the next query on Access SQL.
SELECT
ID_PLAN_ACCION, ID_SEGUIMIENTO,
Max(FECHA_SEGUIMIENTO) AS MAX_FECHA
FROM
SEGUIMIENTOS
WHERE
(((ID_PLAN_ACCION) = [CODPA]))
GROUP BY
ID_PLAN_ACCION, ID_SEGUIMIENTO;
And it returns this information:
ID_PLAN_ACCION ID_SEGUIMIENTO MAX_FECHA
-----------------------------------------------
A1-01 1 16/01/2014
A1-01 2 30/01/2014
But I really need that it throws off this:
ID_PLAN_ACCION ID_SEGUIMIENTO MAX_FECHA
----------------------------------------------
A1-01 2 30/01/2014
As you can see I only need the record that has the most recently date, not all the records
The GROUP BY doesn't work.
Please can you tell me what can I do? I am new on all this.
Thank you so much!!
PD: Sorry for my english, I'm learning
This will produce the results you want:
SELECT ID_PLAN_ACCION, max(ID_SEGUIMIENTO) as ID_SEGUIMIENTO, Max(FECHA_SEGUIMIENTO) AS MAX_FECHA
FROM SEGUIMIENTOS
WHERE ID_PLAN_ACCION = [CODPA]
GROUP BY ID_PLAN_ACCION;
I removed id_sequiimiento from the group by and added an aggregation function to get the max value. If the ids increase along with the date, this will work.
Another way to approach this query, though, is to use top and order by:
SELECT top 1 ID_PLAN_ACCION, ID_SEGUIMIENTO, FECHA_SEGUIMIENTO
FROM SEGUIMIENTOS
WHERE ID_PLAN_ACCION = [CODPA]
ORDER BY FECHA_SEGUIMIENTO desc;
This works because you are only returning one row.
EDIT:
If you have more codes, that you are looking at, you need a more complicated query. Here is an approach using where/not exists:
SELECT ID_PLAN_ACCION, ID_SEGUIMIENTO, FECHA_SEGUIMIENTO
FROM SEGUIMIENTOS s
WHERE not exists (select 1
from SEGUIMIENTOS s2
where s.ID_PLAN_ACCION = s2.ID_PLAN_ACCION and
s2.FECHA_SEGUIMIENTO > s.FECHA_SEGUIMIENTO
)
ORDER BY FECHA_SEGUIMIENTO desc;
You can read this as: "Get me all rows from SEGUIMIENTOS where there is no other row with the same ID_PLAN_ACCION that has a larger date". Is is another way of saying that the original row has the maximum date.

How to combine this query

In the query
cr is customers,
chh? ise customer_pays,
cari_kod is customer code,
cari_unvan1 is customer name
cha_tarihi is date of pay,
cha_meblag is pay amount
The purpose of query, the get the specisified list of customers and their last date for pay and amount of money...
Actually my manager needs more details but the query is very slow and that is why im using only 3 subquery.
The question is how to combine them ?
I have researched about Cte and "with clause" and "subquery in "where " but without luck.
Can anybody have a proposal.
Operating system is win2003 and sql server version is mssql 2005.
Regards
select cr.cari_kod,cr.cari_unvan1, cr.cari_temsilci_kodu,
(select top 1
chh1.cha_tarihi
from dbo.CARI_HESAP_HAREKETLERI chh1 where chh1.cha_kod=cr.cari_kod order by chh1.cha_RECno) as sontar,
(select top 1
chh2.cha_meblag
from dbo.CARI_HESAP_HAREKETLERI chh2 where chh2.cha_kod=cr.cari_kod order by chh2.cha_RECno) as sontutar
from dbo.CARI_HESAPLAR cr
where (select top 1
chh3.cha_tarihi
from dbo.CARI_HESAP_HAREKETLERI chh3 where chh3.cha_kod=cr.cari_kod order by chh3.cha_RECno) >'20130314'
and
cr.cari_bolge_kodu='322'
or
cr.cari_bolge_kodu='324'
order by cr.cari_kod
You will probably speed up the query by changing your last where clause to:
where (select top 1 chh3.cha_tarihi
from dbo.CARI_HESAP_HAREKETLERI chh3 where chh3.cha_kod=cr.cari_kod
order by chh3.cha_RECno
) >'20130314' and
cr.cari_bolge_kodu in ('322', '324')
order by cr.cari_kod
Assuming that you want both the date condition met and one of the two codes. Your original logic is the (date and code = 322) OR (code = 324).
The overall query can be improved by finding the record in the chh table and then just using that. For this, you want to use the window function row_number(). I think this is the query that you want:
select cari_kod, cari_unvan1, cari_temsilci_kodu,
cha_tarihi, cha_meblag
from (select cr.*, chh.*,
ROW_NUMBER() over (partition by chh.cha_kod order by chh.cha_recno) as seqnum
from dbo.CARI_HESAPLAR cr join
dbo.CARI_HESAP_HAREKETLERI chh
on chh.cha_kod=cr.cari_kod
where cr.cari_bolge_kodu in ('322', '324')
) t
where chh3.cha_tarihi > '20130314' and seqnum = 1
order by cr.cari_kod;
This version assumes the revised logic date/code logic.
The inner subquery select might generate an error if there are two columns with the same name in both tables. If so, then just list the columns instead of using *.

Joining to a limited subquery?

I have this releases table in a SQLite3 database, listing each released version of an application:
|release_id|release_date|app_id|
|==========|============|======|
| 1001| 2009-01-01 | 1|
| 1003| 2009-01-01 | 1|
| 1004| 2009-02-02 | 2|
| 1005| 2009-01-15 | 1|
So for each app_id, there will be multiple rows. I have another table, apps:
|app_id|name |
|======|========|
| 1|Everest |
| 2|Fuji |
I want to display the name of the application and the newest release, where "newest" means (a) newest release_date, and if there are duplicates, (b) highest release_id.
I can do this for an individual application:
SELECT apps.name,releases.release_id,releases.release_date
FROM apps
INNER JOIN releases
ON apps.app_id = releases.app_id
WHERE releases.release_id = 1003
ORDER BY releases.release_date,releases.release_id
LIMIT 1
but of course that ORDER BY applies to the whole SELECT query, and if I leave out the WHERE clause, it still returns only one row.
It's a one-shot query on a small database, so slow queries, temp tables, etc. are fine - I just can't get my brain around the SQL way to do this.
This is easy to do with the analytic function ROW_NUMBER(), which I guess sqlite3 doesn't support. But you can do it in a way that's a bit more flexible than what's given in the previous answers:
SELECT
apps.name,
releases.release_id,
releases.release_date
FROM apps INNER JOIN releases
ON apps.app_id = releases.app_id
WHERE NOT EXISTS (
-- // where there doesn't exist a more recent release for the same app
SELECT * FROM releases AS R
WHERE R.app_id = apps.app_id
AND R.release_data > releases.release_data
)
For example, if you had multiple ordering columns that define "latest," MAX wouldn't work for you, but you could modify the EXISTS subquery to capture the more complicated meaning of "latest."
This is the "greatest N per group" problem. It comes up several times per week on StackOverflow.
I usually use a solution like the one in #Steve Kass' answer, but I do it without subqueries (I got into the habit years ago with MySQL 4.0, which didn't support subqueries):
SELECT a.name, r1.release_id, r1.release_date
FROM apps a
INNER JOIN releases r1
LEFT OUTER JOIN releases r2 ON (r1.app_id = r2.app_id
AND (r1.release_date < r2.release_date
OR r1.release_date = r2.release_date AND r1.release_id < r2.release_id))
WHERE r2.release_id IS NULL;
Internally, this probably optimizes identically to the NOT EXISTS syntax. You can analyze the query with EXPLAIN to make sure.
Re your comment, you could just skip the test for release_date because release_id is just as useful for establishing the chronological order of releases, and I assume it's guaranteed to be unique, so this simplifies the query:
SELECT a.name, r1.release_id, r1.release_date
FROM apps a
INNER JOIN releases r1
LEFT OUTER JOIN releases r2 ON (r1.app_id = r2.app_id
AND r1.release_id < r2.release_id)
WHERE r2.release_id IS NULL;
It's ugly, but I think it'll work
select apps.name, (select releases.release_id from releases where releases.app_id=apps.app_id order by releases.release_date, releases.release_id), (select releases.release_date from releases where releases.app_id=apps.app_id order by releases.release_date, releases.release_id) from apps order by apps.app_id
I hope there's some way to get both of those columns in one embedded select, but I don't know it.
Try:
SELECT a.name,
t.max_release_id,
t.max_date
FROM APPS a
JOIN (SELECT t.app_id,
MAX(t.release_id) 'max_release_id',
t.max_date
FROM (SELECT r.app_id,
r.release_id,
MAX(r.release_date) 'max_date'
FROM RELEASES r
GROUP BY r.app_id, r.release_id)
GROUP BY t.app_id, t.max_date) t
Err second attempt. Assuming that IDs are monotonically increasing and overflow is not a likely occurance, you can ignore the date and just do:
SELECT apps.name, releases.release_id, releases.release_date
FROM apps INNER JOIN releases on apps.app_id = releases.app_id
WHERE releases.release_id IN
(SELECT Max(release_id) FROM releases
GROUP BY app_id);