Group by not working to get count of a column with other max record in sql - sql

I have a table named PublishedData, see image below
I'm trying to get output like the image below.

I think you can use a query like this:
SELECT dt.DistrictName, ISNULL(dt.Content, 'N/A') Content, dt.UpdatedDate, mt.LastPublished, mt.Unpublished
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY DistrictName ORDER BY UpdatedDate DESC, ISNULL(Content, 'zzzzz')) seq
FROM PublishedData) dt
INNER JOIN (
SELECT DistrictName, MAX(LastPublished) LastPublished, COUNT(CASE WHEN IsPublished = 0 THEN 1 END) Unpublished
FROM PublishedData
GROUP BY DistrictName) mt
ON dt.DistrictName = mt.DistrictName
WHERE
dt.seq = 1;
This assumes the first two columns should come from the row that sorts first when each district is ordered by UpdatedDate (newest first) and then by Content.
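To check how the two derived tables combine, here is a minimal sketch that runs the same shape of query through Python's sqlite3 module. The sample rows are invented (the images from the question are not available), SQLite stands in for SQL Server, and COALESCE replaces ISNULL. It assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

# Hypothetical sample rows; the real PublishedData contents are not shown in the post.
rows = [
    # DistrictName, Content, UpdatedDate, LastPublished, IsPublished
    ("North", "Roads", "2019-01-05", "2019-01-02", 0),
    ("North", "Parks", "2019-01-03", "2019-01-04", 1),
    ("North", None, "2019-01-01", None, 0),
    ("South", None, "2019-02-01", "2019-01-20", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PublishedData ("
    "DistrictName TEXT, Content TEXT, UpdatedDate TEXT, "
    "LastPublished TEXT, IsPublished INTEGER)"
)
conn.executemany("INSERT INTO PublishedData VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT dt.DistrictName, COALESCE(dt.Content, 'N/A') AS Content, dt.UpdatedDate,
       mt.LastPublished, mt.Unpublished
FROM (
    -- seq = 1 marks the most recently updated row of each district
    SELECT *, ROW_NUMBER() OVER (
               PARTITION BY DistrictName
               ORDER BY UpdatedDate DESC, COALESCE(Content, 'zzzzz')) AS seq
    FROM PublishedData) dt
INNER JOIN (
    -- one row per district: latest publish date and number of unpublished rows
    SELECT DistrictName, MAX(LastPublished) AS LastPublished,
           COUNT(CASE WHEN IsPublished = 0 THEN 1 END) AS Unpublished
    FROM PublishedData
    GROUP BY DistrictName) mt
    ON dt.DistrictName = mt.DistrictName
WHERE dt.seq = 1
ORDER BY dt.DistrictName
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With these invented rows, each district comes back once, carrying its newest row plus the per-district aggregates. COUNT(CASE ...) counts only the rows where the CASE yields a non-NULL value, which is why a district with nothing unpublished still appears with a count of 0.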

Try something like this (I don't have your tables, but it should give you the idea for your own query):
SELECT DistrictName,
MAX(UpdatedDate) AS UpdatedDate,
MAX(LastPublished) AS LastPublished,
(
-- correlated subquery: unpublished rows for the current district
SELECT COUNT(*)
FROM PublishedData inr
WHERE inr.DistrictName = outr.DistrictName
AND inr.IsPublished = 0
) AS Unpublished
FROM PublishedData outr
GROUP BY DistrictName

The PublishedData table would need a unique identity column to produce that output, because the latest Content cannot be determined from the given schema.
If you want the data apart from Content (DistrictName, UpdatedDate, LastPublished and the count of unpublished records), use the query below:
select T1.DistrictName,T1.UpdatedDate,T1.LastPublished,T2.Unpublished from
(select DistrictName,Max(UpdatedDate) as UpdatedDate,Max(LastPublished) as LastPublished from PublishedData group by DistrictName) T1
inner join
(select DistrictName,count(IsPublished) as Unpublished from PublishedData where IsPublished=0 group by DistrictName) T2 ON T1.DistrictName=T2.DistrictName ORDER BY T2.Unpublished DESC

Related

SQL query to return duplicate rows for certain column, but with unique values for another column

I have written the query shown here, which combines three tables and returns rows where the at_ticket_num from appeal_tickets is duplicated but against a different at_system_ref value:
select top 100
t.t_reference, at.at_system_ref, at_ticket_num, a.a_case_ref
from
tickets t, appeal_tickets at, appeals_2 a
where
t.t_reference in ('AB123','AB234') -- filtering on these values so that I can see that its working
and t.t_number = at.at_ticket_num
and at.at_system_ref = a.a_system_ref
and at.at_ticket_num IN (select at_ticket_num
from appeal_tickets
group by at_ticket_num
having count(distinct at_system_ref) > 1)
order by
t.t_reference desc
This is the output:
t_reference at_system_ref at_ticket_num a_case_ref
-------------------------------------------------------
AB123 30838974 23641583 1111979010
AB123 30838976 23641583 1111979010
AB234 30839149 23641520 1111977352
AB234 30839209 23641520 1111988003
I want to modify this so that it only returns records where t_reference is duplicated but against a different a_case_ref. So in the case above, only the records for AB234 would be returned.
Any help would be much appreciated.
You want all ticket appeals that have more than one system reference and more than one case reference it seems. You can join the tables, count the occurrences per ticket and then only keep the tickets that match these criteria.
select *
from
(
select
t.t_reference, at.at_system_ref, at.at_ticket_num, a.a_case_ref,
count(distinct a.a_system_ref) over (partition by at.at_ticket_num) as sysrefs,
count(distinct a.a_case_ref) over (partition by at.at_ticket_num) as caserefs
from tickets t
join appeal_tickets at on at.at_ticket_num = t.t_number
join appeals_2 a on a.a_system_ref = at.at_system_ref
) counted
where sysrefs > 1 and caserefs > 1
order by t_reference, at_system_ref, at_ticket_num, a_case_ref;
Correction
It seems that SQL Server still doesn't support COUNT(DISTINCT ...) OVER (...). You can count distinct values in a subquery though. Replace
count(distinct a.a_system_ref) over (partition by at.at_ticket_num) as sysrefs,
by
(
select count(distinct a2.a_system_ref)
from appeal_tickets at2
join appeals_2 a2 on a2.a_system_ref = at2.at_system_ref
where at2.at_ticket_num = t.t_number
) as sysrefs,
An alternative workaround is to use DENSE_RANK in two directions (found here: https://stackoverflow.com/a/53518204/2270762):
dense_rank() over (partition by at.at_ticket_num order by a.a_system_ref) +
dense_rank() over (partition by at.at_ticket_num order by a.a_system_ref desc) -
1 as sysrefs,
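The two-direction DENSE_RANK trick is easier to trust after seeing it on a few rows. The sketch below is only an illustration: it uses Python's sqlite3 module (SQLite 3.25 or later) in place of SQL Server, and a hypothetical flattened table named joined instead of the three joined tables. The repeated row is invented to show that duplicates are counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined tables: one row per (ticket, system ref).
conn.execute("CREATE TABLE joined (at_ticket_num INTEGER, a_system_ref INTEGER)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?)",
    [
        (23641583, 30838974),
        (23641583, 30838976),
        (23641583, 30838976),  # invented repeat, should be counted once
        (23641520, 30839149),
    ],
)

query = """
SELECT DISTINCT at_ticket_num,
       -- rank from the bottom + rank from the top - 1 = number of distinct values
       DENSE_RANK() OVER (PARTITION BY at_ticket_num ORDER BY a_system_ref)
     + DENSE_RANK() OVER (PARTITION BY at_ticket_num ORDER BY a_system_ref DESC)
     - 1 AS sysrefs
FROM joined
ORDER BY at_ticket_num
"""
result = conn.execute(query).fetchall()
print(result)
```

Every row in a partition ends up with the same sum, because a value that is k-th from the bottom is (n - k + 1)-th from the top when there are n distinct values. One caveat: unlike COUNT(DISTINCT ...), this approach treats NULL as a value of its own, so it only matches when the column has no NULLs.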
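Another option is to flag each t_reference whose minimum and maximum a_case_ref differ:
with data as (
<your query plus one column>,
case when
min(a.a_case_ref) over (partition by t.t_reference)
<>
max(a.a_case_ref) over (partition by t.t_reference)
then 1 end as dup
)
select * from data where dup = 1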

SQL return the max version of each documents

I have rows with duplicated document numbers, each with a value and a version.
I need to return only the row with the highest version for each document.
Document|value|version
A20|100|1
A20|200|2
A24|100|1
A24|300|2
A24|200|3
A25|100|1
A26|100|1
Expected result, returning only the latest version of each document:
Document|value|version
A20|200|2
A24|200|3
A25|100|1
A26|100|1
Here is what I tried, but it returns every row instead of only the highest version of each document:
SELECT MAX(FACT.VERSION), FACT.DOCUMENT, FACT.VALUE
FROM PUBLIC.FACT FACT
GROUP BY FACT.DOCUMENT, FACT.VALUE, FACT.VERSION
SELECT FACT.*
FROM PUBLIC.FACT FACT
INNER JOIN
(SELECT FACT2.DOCUMENT, MAX(FACT2.VERSION) AS HVERSION
FROM PUBLIC.FACT FACT2
GROUP BY FACT2.DOCUMENT) HV
ON HV.DOCUMENT = FACT.DOCUMENT
AND HV.HVERSION = FACT.VERSION
Try this:
SELECT FAC.Document, FAC.value, FAC.version
FROM (
-- rn = 1 is the row with the highest version for each document
SELECT *, ROW_NUMBER() OVER (PARTITION BY Document ORDER BY version DESC) AS rn
FROM PUBLIC.FACT
) FAC
WHERE rn = 1
This is a common question on SO and appears in business frequently. It's helpful to think of it as a pattern.
1. Identify the max version by grouping field(s) in a CTE, subquery, or temp table.
2. Join the recordset from step 1 back to the original table on the grouping fields and max_version = version.
Like so:
SELECT X.*
FROM PUBLIC.FACT X
INNER JOIN (SELECT DOCUMENT, MAX(VERSION) AS MAX_VERSION
FROM PUBLIC.FACT
GROUP BY DOCUMENT) Y ON X.DOCUMENT = Y.DOCUMENT AND X.VERSION = Y.MAX_VERSION;
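As a quick check of the pattern, the sketch below loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and runs the join. SQLite is only a stand-in here, and the PUBLIC schema prefix is dropped because SQLite has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FACT (DOCUMENT TEXT, VALUE INTEGER, VERSION INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO FACT VALUES (?, ?, ?)",
    [
        ("A20", 100, 1), ("A20", 200, 2),
        ("A24", 100, 1), ("A24", 300, 2), ("A24", 200, 3),
        ("A25", 100, 1),
        ("A26", 100, 1),
    ],
)

query = """
SELECT X.DOCUMENT, X.VALUE, X.VERSION
FROM FACT X
INNER JOIN (SELECT DOCUMENT, MAX(VERSION) AS MAX_VERSION
            FROM FACT
            GROUP BY DOCUMENT) Y
    ON X.DOCUMENT = Y.DOCUMENT AND X.VERSION = Y.MAX_VERSION
ORDER BY X.DOCUMENT
"""
result = conn.execute(query).fetchall()
for row in result:
    print("|".join(str(col) for col in row))
```

The printed rows match the expected result listed in the question. If two rows of a document could share the same highest version, this join would return both of them, whereas a ROW_NUMBER() filter would keep only one.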

How to include column not included in Group By

I have the table DirectCosts with the following columns:
DetailsID (unique)
InvoiceNumber
ProjectID
PayableID
I need to find the duplicate combinations of payableid and invoicenumber.
How can I adjust the following query so that it handles the combination AND displays the list of duplicate rows instead of the count?
SELECT sinvoicenumber, count(*)
FROM exportdirectcostdetails where iprocoreprojectid = 1187294
GROUP BY sinvoicenumber
HAVING COUNT(*) > 2
Is there a way it can display all columns?
Original Question : Why do I get error ed2 should have column name defined
You are using a derived table, so every column in the derived table needs a name.
select ed1.sinvoicenumber,
ed1.ipayableid,
ed2.sinvoicenumber
from ExportDirectCostDetails ed1
inner join
(
SELECT sinvoicenumber, count(sinvoicenumber) AS InvoiceNumberCount
FROM exportdirectcostdetails
where iprocoreprojectid = 1187294
GROUP BY sinvoicenumber
HAVING COUNT(*) > 2
) ed2
on ed1.sinvoicenumber = ed2.sinvoicenumber
Updated Question: How to have all column names
Use a window COUNT with a PARTITION BY clause, then filter on that count as shown below:
SELECT t.* FROM
(SELECT *, count(*) OVER(PARTITION BY ipayableid, sinvoicenumber) AS InvoiceCount
FROM exportdirectcostdetails where iprocoreprojectid = 1187294) as t
WHERE InvoiceCount > 1
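Here is a small sketch of that window-count approach, run through Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server. The table and column names follow the DirectCosts description in the question, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names follow the question's table description; the rows are invented.
conn.execute(
    "CREATE TABLE DirectCosts ("
    "DetailsID INTEGER PRIMARY KEY, InvoiceNumber TEXT, "
    "ProjectID INTEGER, PayableID INTEGER)"
)
conn.executemany(
    "INSERT INTO DirectCosts VALUES (?, ?, ?, ?)",
    [
        (1, "INV-1", 1187294, 10),
        (2, "INV-1", 1187294, 10),  # same payable + invoice as row 1
        (3, "INV-1", 1187294, 11),  # same invoice, different payable
        (4, "INV-2", 1187294, 10),
    ],
)

query = """
SELECT DetailsID, InvoiceNumber, ProjectID, PayableID, InvoiceCount
FROM (
    -- every row keeps all its columns and learns how many rows share its pair
    SELECT *, COUNT(*) OVER (PARTITION BY PayableID, InvoiceNumber) AS InvoiceCount
    FROM DirectCosts
    WHERE ProjectID = 1187294) AS t
WHERE InvoiceCount > 1
ORDER BY DetailsID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Only rows 1 and 2 come back, since they are the only ones sharing both PayableID and InvoiceNumber. Unlike GROUP BY, the window function does not collapse rows, which is what lets the query show every column.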

Multiple results - Need only the latest price

I need to find the latest price for some items
This is my query:
SELECT
MAX("POPORH1"."DATE") as "PO DATE",
"ICSHEH"."DOCNUM",
"ICSHEH"."TRANSDATE",
"ICSHEH"."FISCYEAR",
"ICSHEH"."FISCPERIOD",
"ICSHEH"."REFERENCE",
"ICSHED"."ITEMNO",
"ICSHED"."ITEMDESC",
"ICSHED"."LOCATION",
"ICSHED"."QUANTITY",
"ICSHED"."UNIT",
"POPORL"."UNITCOST"
FROM (("CABDAT"."dbo"."ICSHEH" "ICSHEH"
INNER JOIN
"CABDAT"."dbo"."ICSHED" "ICSHED" ON "ICSHEH"."SEQUENCENO"="ICSHED"."SEQUENCENO")
INNER JOIN "CABDAT"."dbo"."POPORL" "POPORL" ON "ICSHED"."ITEMNO"="POPORL"."ITEMNO")
INNER JOIN "CABDAT"."dbo"."POPORH1" "POPORH1" ON "POPORL"."PORHSEQ"="POPORH1"."PORHSEQ"
WHERE "ICSHED"."SEQUENCENO"=55873
group by
"ICSHEH"."DOCNUM",
"ICSHEH"."TRANSDATE",
"ICSHEH"."FISCYEAR",
"ICSHEH"."FISCPERIOD",
"ICSHEH"."REFERENCE",
"ICSHED"."ITEMNO",
"ICSHED"."ITEMDESC",
"ICSHED"."LOCATION",
"ICSHED"."QUANTITY",
"ICSHED"."UNIT",
"POPORL"."UNITCOST"
This query returns multiple results
These are the results:
"PODATE"='20180405' "ITEMNO"='2944' "UNITCOST"='0.266750'
"PODATE"='20180405' "ITEMNO"='2946' "UNITCOST"='0.266750'
"PODATE"='20170208' "ITEMNO"='2944' "UNITCOST"='0.250780'
"PODATE"='20170208' "ITEMNO"='2944' "UNITCOST"='0.250780'
"PODATE"='20170208' "ITEMNO"='2946' "UNITCOST"='0.250780'
"PODATE"='20170208' "ITEMNO"='2946' "UNITCOST"='0.250780'
I need to have only
"PODATE"='20180405' "ITEMNO"='2944' "UNITCOST"='0.266750'
"PODATE"='20180405' "ITEMNO"='2946' "UNITCOST"='0.266750'
I am learning SQL, so please be patient with my ignorance...
Thanks a lot!
You just need row_number().
WITH cte as (
SELECT *, ROW_NUMBER() OVER (PARTITION BY "ITEMNO" ORDER BY "PODATE" DESC) as rn
FROM "ICSHED" -- or join tables
WHERE "ICSHED"."SEQUENCENO"=55873
)
SELECT *
FROM cte where rn = 1
Or, if you only need the single latest row without any grouping, you can use TOP 1:
SELECT TOP 1 *
FROM "ICSHED" -- or join tables
WHERE "ICSHED"."SEQUENCENO"=55873
ORDER BY "PODATE" DESC
As I understand it, you want the two rows with the most recent date, so try this:
select top 2 * from yourtable order by dateCol desc

SQL multi-table query guidance

I have the following query:
SELECT
_RES_COLL_EVM00012.MachineID,
_RES_COLL_EVM00012.Name,
v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0,
v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0,
v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp,
v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID
FROM
_RES_COLL_EVM00012
LEFT JOIN v_GS_NETWORK_ADAPTER_CONFIGUR
ON _RES_COLL_EVM00012.MachineID = v_GS_NETWORK_ADAPTER_CONFIGUR.ResourceID
WHERE
v_GS_NETWORK_ADAPTER_CONFIGUR.IPEnabled0 = 1
AND v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0 != '0.0.0.0'
AND v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0 IS NOT NULL
AND v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0 != '0.0.0.0'
AND v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0 IS NOT NULL
ORDER BY
_RES_COLL_EVM00012.Name ASC,
v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp DESC,
v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID DESC
Which returns something like the following:
MachineID  Name    IPAddress0      DefaultGatewayIP0  TimeStamp            RevisionID
16777323   CTNB21  192.168.17.134  192.168.17.254     9/09/2013 13:07:11   8
16777323   CTNB21  192.168.17.143  192.168.17.254     9/09/2013 13:07:11   6
16777585   CTNB26  192.168.16.106  192.168.16.254     28/10/2013 22:39:55  33
16777585   CTNB26  192.168.16.116  192.168.16.254     28/10/2013 22:39:55  27
Obviously ResourceID is not unique in the table v_GS_NETWORK_ADAPTER_CONFIGUR. What I need to do is display every row from the table _RES_COLL_EVM00012 along with a SINGLE row for each from v_GS_NETWORK_ADAPTER_CONFIGUR.
The row selected from v_GS_NETWORK_ADAPTER_CONFIGUR should be the one with the most recent TimeStamp and the greatest RevisionID.
Note also I do not actually want to select MachineID, TimeStamp or RevisionID, I have just done so to help better explain my request.
One more thing, if a row does not exist in v_GS_NETWORK_ADAPTER_CONFIGUR with a match for the MachineID/ResourceID, I still need to output the Name but with blank values for IPAddress0 and DefaultGatewayIP0
So to clarify I would like the example result set to look like this instead:
Name    IPAddress0      DefaultGatewayIP0
CTNB21  192.168.17.134  192.168.17.254
CTNB26  192.168.16.106  192.168.16.254
Try this:
SELECT
--_RES_COLL_EVM00012.MachineID,
_RES_COLL_EVM00012.Name,
ISNULL(v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0,'') as IPAddress0,
ISNULL(v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0,'') as DefaultIPGateway0
--v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp,
--v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID
FROM
_RES_COLL_EVM00012
LEFT JOIN v_GS_NETWORK_ADAPTER_CONFIGUR
ON _RES_COLL_EVM00012.MachineID = v_GS_NETWORK_ADAPTER_CONFIGUR.ResourceID
LEFT JOIN (SELECT a.ResourceID,a.RevisionID, MAX(a.TimeStamp) as TimeStamp
FROM v_GS_NETWORK_ADAPTER_CONFIGUR a
join (SELECT ResourceID, MAX(RevisionID) as RevisionID
FROM v_GS_NETWORK_ADAPTER_CONFIGUR
GROUP BY ResourceID) b
ON a.ResourceID=b.ResourceID
AND a.RevisionID=b.RevisionID -- keep only the highest revision per resource
GROUP BY a.ResourceID,a.RevisionID
)c
ON v_GS_NETWORK_ADAPTER_CONFIGUR.ResourceID=c.ResourceID
AND v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID=c.RevisionID
AND v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp=c.TimeStamp
WHERE
c.RevisionID IS NOT NULL
OR v_GS_NETWORK_ADAPTER_CONFIGUR.ResourceID IS NULL -- keep machines with no adapter row
ORDER BY
_RES_COLL_EVM00012.Name ASC,
v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp DESC,
v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID DESC
Use DENSE_RANK() OVER (PARTITION BY MachineID ORDER BY TimeStamp DESC, RevisionID DESC) in the select statement, as below, so that the newest row for each machine gets rank 1.
SELECT *
FROM (SELECT _RES_COLL_EVM00012.MachineID,
_RES_COLL_EVM00012.Name,
v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0,
v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0,
v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp,
v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID,
DENSE_RANK() OVER (PARTITION BY _RES_COLL_EVM00012.MachineID
ORDER BY v_GS_NETWORK_ADAPTER_CONFIGUR.TimeStamp DESC, v_GS_NETWORK_ADAPTER_CONFIGUR.RevisionID DESC) RowID
FROM _RES_COLL_EVM00012
LEFT JOIN v_GS_NETWORK_ADAPTER_CONFIGUR
ON _RES_COLL_EVM00012.MachineID = v_GS_NETWORK_ADAPTER_CONFIGUR.ResourceID
WHERE v_GS_NETWORK_ADAPTER_CONFIGUR.IPEnabled0 = 1
AND v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0 != '0.0.0.0'
AND v_GS_NETWORK_ADAPTER_CONFIGUR.IPAddress0 IS NOT NULL
AND v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0 != '0.0.0.0'
AND v_GS_NETWORK_ADAPTER_CONFIGUR.DefaultIPGateway0 IS NOT NULL
) XYZ
WHERE XYZ.RowID = 1
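The question also asks to keep machines that have no adapter row at all. One way to get that, sketched below, is to rank the adapter rows inside a derived table and put the rank test in the LEFT JOIN condition rather than in WHERE. This is an illustration under assumptions, not a drop-in query: it runs on SQLite (3.25 or later) through Python's sqlite3 module, the short table names machines and adapters are hypothetical stand-ins for _RES_COLL_EVM00012 and v_GS_NETWORK_ADAPTER_CONFIGUR, the IPEnabled0 filter is omitted, timestamps are rewritten as sortable ISO strings, and machine CTNB99 is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical short names standing in for the two views in the question.
conn.execute("CREATE TABLE machines (MachineID INTEGER, Name TEXT)")
conn.execute(
    "CREATE TABLE adapters (ResourceID INTEGER, IPAddress0 TEXT, "
    "DefaultIPGateway0 TEXT, TimeStamp TEXT, RevisionID INTEGER)"
)
conn.executemany(
    "INSERT INTO machines VALUES (?, ?)",
    [
        (16777323, "CTNB21"),
        (16777585, "CTNB26"),
        (16777999, "CTNB99"),  # invented machine with no adapter rows
    ],
)
conn.executemany(
    "INSERT INTO adapters VALUES (?, ?, ?, ?, ?)",
    [
        (16777323, "192.168.17.134", "192.168.17.254", "2013-09-09 13:07:11", 8),
        (16777323, "192.168.17.143", "192.168.17.254", "2013-09-09 13:07:11", 6),
        (16777585, "192.168.16.106", "192.168.16.254", "2013-10-28 22:39:55", 33),
        (16777585, "192.168.16.116", "192.168.16.254", "2013-10-28 22:39:55", 27),
    ],
)

query = """
SELECT m.Name,
       COALESCE(a.IPAddress0, '') AS IPAddress0,
       COALESCE(a.DefaultIPGateway0, '') AS DefaultIPGateway0
FROM machines m
LEFT JOIN (
    -- rn = 1 is the newest adapter row per machine (latest TimeStamp, then RevisionID)
    SELECT *, ROW_NUMBER() OVER (
               PARTITION BY ResourceID
               ORDER BY TimeStamp DESC, RevisionID DESC) AS rn
    FROM adapters
    WHERE IPAddress0 <> '0.0.0.0' AND DefaultIPGateway0 <> '0.0.0.0'
) a ON a.ResourceID = m.MachineID AND a.rn = 1
ORDER BY m.Name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Filtering on adapter columns in the outer WHERE clause would discard the NULL-extended rows and quietly turn the LEFT JOIN into an inner join. Keeping those filters inside the derived table and the rn = 1 test in the ON clause is what preserves the machine that has no adapter data.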