How to include column not included in Group By - sql

I have the table DirectCosts with the following columns:
DetailsID (unique)
InvoiceNumber
ProjectID
PayableID
I need to find the duplicate combinations of payableid and invoicenumber.
How can I adjust the following query so that it accounts for the combination AND displays the list of rows instead of the count?
SELECT sinvoicenumber, count(*)
FROM exportdirectcostdetails where iprocoreprojectid = 1187294
GROUP BY sinvoicenumber
HAVING COUNT(*) > 2
Is there a way it can display all columns?

Original question: why do I get the error that ed2 should have a column name defined?
You are using a derived table, so every column in the derived table needs a name. Give the aggregate an alias:
select ed1.sinvoicenumber,
ed1.ipayableid,
ed2.sinvoicenumber
from ExportDirectCostDetails ed1
inner join
(
SELECT sinvoicenumber, count(sinvoicenumber) AS InvoiceNumberCount
FROM exportdirectcostdetails
where iprocoreprojectid = 1187294
GROUP BY sinvoicenumber
HAVING COUNT(*) > 2
) ed2
on ed1.sinvoicenumber = ed2.sinvoicenumber
Updated question: how to return all columns
Use a window function with a PARTITION BY clause to compute the count per combination, then filter on that count:
SELECT t.* FROM
(SELECT *, count(*) OVER(PARTITION BY payableid,invoiceNumber) AS InvoiceCount
FROM exportdirectcostdetails where iprocoreprojectid = 1187294) as t
WHERE InvoiceCount > 1
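To make the window-function approach concrete, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. The table name direct_costs, its snake_case columns, and the sample rows are assumptions made up for illustration, and it assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table modelled on the columns listed in the question.
conn.execute(
    "CREATE TABLE direct_costs ("
    "details_id INTEGER PRIMARY KEY, invoice_number TEXT, "
    "project_id INTEGER, payable_id INTEGER)"
)
conn.executemany(
    "INSERT INTO direct_costs VALUES (?, ?, ?, ?)",
    [
        (1, "INV-1", 1, 10),
        (2, "INV-1", 1, 10),  # duplicates row 1 on (payable_id, invoice_number)
        (3, "INV-1", 1, 11),
        (4, "INV-2", 1, 10),
        (5, "INV-2", 2, 10),  # other project, filtered out before counting
    ],
)

# Count rows per (payable_id, invoice_number) without collapsing them,
# then keep only the rows whose combination occurs more than once.
rows = conn.execute(
    """
    SELECT details_id, invoice_number, payable_id, invoice_count
    FROM (
        SELECT *,
               COUNT(*) OVER (PARTITION BY payable_id, invoice_number) AS invoice_count
        FROM direct_costs
        WHERE project_id = 1
    ) AS t
    WHERE invoice_count > 1
    ORDER BY details_id
    """
).fetchall()
print(rows)
```

Because the count is attached to every row rather than produced by GROUP BY, all original columns stay available in the outer query.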

Related

Count Distinct values in one column based on other column

I am trying to count the distinct values of Z_l for a given value of X, using a WITH clause. A sample data exercise is included below.
Please look at the picture: it shows the distinct values of Z_l where X='ny'.
with distincz_l as (select ny.X, ny.z_l o.cnt From HOPL ny join (select X, count(*) as cnt from HOPL group by X) o on (ny.X = o.Z_l)) select * from HOPL;
You don't even need a WITH clause, since a single statement is enough:
SELECT z_l, count(1)
FROM hopl
WHERE x='ny'
GROUP BY z_l
;
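As a rough illustration, the sketch below runs that GROUP BY query in an in-memory SQLite database, next to a COUNT(DISTINCT ...) variant in case a single number of distinct values is what is wanted. The sample rows are invented; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hopl (x TEXT, z_l TEXT)")
# Invented sample data.
conn.executemany(
    "INSERT INTO hopl VALUES (?, ?)",
    [("ny", "a"), ("ny", "a"), ("ny", "b"), ("la", "c")],
)

# Rows per z_l value among the rows where x = 'ny'.
per_value = conn.execute(
    "SELECT z_l, COUNT(1) FROM hopl WHERE x = 'ny' GROUP BY z_l ORDER BY z_l"
).fetchall()

# Number of distinct z_l values among the rows where x = 'ny'.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT z_l) FROM hopl WHERE x = 'ny'"
).fetchone()[0]

print(per_value, distinct_count)
```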

SQL query to return duplicate rows for certain column, but with unique values for another column

I have written the query shown here that combines three tables and returns rows where the at_ticket_num from appeal_tickets is duplicated but against a different at_system_ref value:
select top 100
t.t_reference, at.at_system_ref, at_ticket_num, a.a_case_ref
from
tickets t, appeal_tickets at, appeals_2 a
where
t.t_reference in ('AB123','AB234') -- filtering on these values so that I can see that it's working
and t.t_number = at.at_ticket_num
and at.at_system_ref = a.a_system_ref
and at.at_ticket_num IN (select at_ticket_num
from appeal_tickets
group by at_ticket_num
having count(distinct at_system_ref) > 1)
order by
t.t_reference desc
This is the output:
t_reference at_system_ref at_ticket_num a_case_ref
-------------------------------------------------------
AB123 30838974 23641583 1111979010
AB123 30838976 23641583 1111979010
AB234 30839149 23641520 1111977352
AB234 30839209 23641520 1111988003
I want to modify this so that it only returns records where t_reference is duplicated but against a different a_case_ref. So in above case only records for AB234 would be returned.
Any help would be much appreciated.
You want all ticket appeals that have more than one system reference and more than one case reference it seems. You can join the tables, count the occurrences per ticket and then only keep the tickets that match these criteria.
select *
from
(
select
t.t_reference, at.at_system_ref, at.at_ticket_num, a.a_case_ref,
count(distinct a.a_system_ref) over (partition by at.at_ticket_num) as sysrefs,
count(distinct a.a_case_ref) over (partition by at.at_ticket_num) as caserefs
from tickets t
join appeal_tickets at on at.at_ticket_num = t.t_number
join appeals_2 a on a.a_system_ref = at.at_system_ref
) counted
where sysrefs > 1 and caserefs > 1
order by t_reference, at_system_ref, at_ticket_num, a_case_ref;
Correction
It seems that SQL Server still doesn't support COUNT(DISTINCT ...) OVER (...). You can count distinct values in a subquery though. Replace
count(distinct a.a_system_ref) over (partition by at.at_ticket_num) as sysrefs,
by
(
select count(distinct a2.a_system_ref)
from appeal_tickets at2
join appeals_2 a2 on a2.a_system_ref = at2.at_system_ref
where at2.at_ticket_num = t.t_number
) as sysrefs,
An alternative workaround is to use DENSE_RANK in two directions (found here: https://stackoverflow.com/a/53518204/2270762):
dense_rank() over (partition by at.at_ticket_num order by a.a_system_ref) +
dense_rank() over (partition by at.at_ticket_num order by a.a_system_ref desc) -
1 as sysrefs,
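The DENSE_RANK workaround can be checked on a small data set. The sketch below uses an in-memory SQLite database from Python; the table ticket_refs and its rows are hypothetical, and it assumes SQLite 3.25 or later. The trick relies on the ranked column containing no NULLs, since a NULL would be ranked as one more distinct value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the joined ticket/appeal rows.
conn.execute("CREATE TABLE ticket_refs (ticket INTEGER, sys_ref INTEGER)")
conn.executemany(
    "INSERT INTO ticket_refs VALUES (?, ?)",
    [(1, 100), (1, 100), (1, 200), (2, 300)],
)

# Ascending rank + descending rank - 1 equals the number of distinct
# sys_ref values in the partition, for every row of that partition.
rows = conn.execute(
    """
    SELECT ticket, sys_ref,
           DENSE_RANK() OVER (PARTITION BY ticket ORDER BY sys_ref)
         + DENSE_RANK() OVER (PARTITION BY ticket ORDER BY sys_ref DESC)
         - 1 AS sysrefs
    FROM ticket_refs
    ORDER BY ticket, sys_ref
    """
).fetchall()

# Reference result computed with an ordinary grouped COUNT(DISTINCT ...).
expected = dict(
    conn.execute(
        "SELECT ticket, COUNT(DISTINCT sys_ref) FROM ticket_refs GROUP BY ticket"
    ).fetchall()
)
print(rows, expected)
```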
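Alternatively, add one column to your query that flags a t_reference whose smallest and largest a_case_ref differ, and filter on that flag:
with data as (
<your query plus one column>,
case when
min(a.a_case_ref) over (partition by t.t_reference)
<>
max(a.a_case_ref) over (partition by t.t_reference)
then 1 end as dup
)
select * from data where dup = 1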

querying data with sqlite

I have data in an sqlite db that contains the following columns:
date | name | id | code
All columns are TEXT (I sourced the data from a csv file). I want to build a query that finds all names that have code ABC120 but neither ABC306 nor ABC305 on the same date, with the result grouped by name.
How do I do this?
If you want to use GROUP BY you must group by name, date first and set the conditions in the HAVING clause, but also you must use DISTINCT so the results do not contain duplicate names:
select distinct name
from tablename
group by name, date
having sum(code = 'ABC120') > 0 and sum(code in ('ABC305', 'ABC306')) = 0;
You can get the same results with EXISTS:
select distinct t.name
from tablename t
where t.code = 'ABC120'
and not exists (select 1 from tablename where name = t.name and date = t.date and code in ('ABC305', 'ABC306'))
You can use having:
select date, name
from t
where code in ('ABC120', 'ABC306', 'ABC305')
group by date, name
having min(code) = 'ABC120' and max(code) = 'ABC120';
Note: because of the three codes you chose, 'ABC120' is the smallest of them, so you could just use max(code) = 'ABC120'. However, that does not generalize to other code values.
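The first two queries can be compared on sample data. The following is a minimal sketch using Python's sqlite3 module; the table name tablename is taken from the answer above, while the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (date TEXT, name TEXT, id TEXT, code TEXT)")
# Invented sample rows.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        ("2020-01-01", "alice", "1", "ABC120"),
        ("2020-01-01", "alice", "2", "ABC305"),  # excludes alice on this date
        ("2020-01-02", "alice", "3", "ABC120"),  # alice qualifies on this date
        ("2020-01-01", "bob", "4", "ABC120"),
        ("2020-01-01", "bob", "5", "ABC306"),    # excludes bob
        ("2020-01-01", "carol", "6", "ABC999"),  # never has ABC120
        ("2020-01-03", "dave", "7", "ABC120"),
        ("2020-01-03", "dave", "8", "ABC999"),   # unrelated code, still qualifies
    ],
)

# GROUP BY / HAVING version: in SQLite a comparison evaluates to 0 or 1,
# so SUM(condition) counts the rows that satisfy it.
having_names = conn.execute(
    """
    SELECT DISTINCT name
    FROM tablename
    GROUP BY name, date
    HAVING SUM(code = 'ABC120') > 0
       AND SUM(code IN ('ABC305', 'ABC306')) = 0
    ORDER BY name
    """
).fetchall()

# NOT EXISTS version.
exists_names = conn.execute(
    """
    SELECT DISTINCT t.name
    FROM tablename t
    WHERE t.code = 'ABC120'
      AND NOT EXISTS (
          SELECT 1 FROM tablename
          WHERE name = t.name AND date = t.date
            AND code IN ('ABC305', 'ABC306')
      )
    ORDER BY t.name
    """
).fetchall()
print(having_names, exists_names)
```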

Adding new column of total_event

I want to append a virtual column named total_event to the SELECT result, holding the total for all rows with the same wait_event_type. As shown in the screenshot, I want to sum 'Lock', which will be 18+2 = 20, and show that value against every row of type Lock.
I have an event_stats table with three columns, wait_event_type, wait_event, and event_count, which holds all the data.
You can use a window function to do this:
SELECT
wait_event_type,
wait_event,
event_count,
SUM(event_count) OVER (PARTITION BY wait_event_type) AS total_event_count
FROM my_table
You can also use a GROUP BY clause and a join:
select m.wait_event_type,
m.wait_event,
m.event_count,
t.total_event_count
from my_table m
join (select wait_event_type, sum(event_count) as total_event_count
from my_table
group by wait_event_type) t
on m.wait_event_type = t.wait_event_type
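Both approaches should return the same rows. The sketch below checks this in an in-memory SQLite database (3.25 or later assumed) using the 18 and 2 counts from the question; the wait_event names and the extra IO row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_stats (wait_event_type TEXT, wait_event TEXT, event_count INTEGER)"
)
# The Lock counts come from the question; the event names are hypothetical.
conn.executemany(
    "INSERT INTO event_stats VALUES (?, ?, ?)",
    [("Lock", "transactionid", 18), ("Lock", "tuple", 2), ("IO", "DataFileRead", 5)],
)

# Window-function version: the total is repeated on every row of its type.
window_rows = conn.execute(
    """
    SELECT wait_event_type, wait_event, event_count,
           SUM(event_count) OVER (PARTITION BY wait_event_type) AS total_event_count
    FROM event_stats
    ORDER BY wait_event_type, wait_event
    """
).fetchall()

# GROUP BY + JOIN version.
join_rows = conn.execute(
    """
    SELECT m.wait_event_type, m.wait_event, m.event_count, t.total_event_count
    FROM event_stats m
    JOIN (SELECT wait_event_type, SUM(event_count) AS total_event_count
          FROM event_stats
          GROUP BY wait_event_type) t
      ON m.wait_event_type = t.wait_event_type
    ORDER BY m.wait_event_type, m.wait_event
    """
).fetchall()
print(window_rows)
```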

Group by not working to get count of a column with other max record in sql

I have a table named PublishedData; see the image below.
I'm trying to get output like the image below.
I think you can use a query like this:
SELECT dt.DistrictName, ISNULL(dt.Content, 'N/A') Content, dt.UpdatedDate, mt.LastPublished, mt.Unpublished
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY DistrictName ORDER BY UpdatedDate DESC, ISNULL(Content, 'zzzzz')) seq
FROM PublishedData) dt
INNER JOIN (
SELECT DistrictName, MAX(LastPublished) LastPublished, COUNT(CASE WHEN IsPublished = 0 THEN 1 END) Unpublished
FROM PublishedData
GROUP BY DistrictName) mt
ON dt.DistrictName = mt.DistrictName
WHERE
dt.seq = 1;
This works because, as far as I can tell, your first two columns come from the first row per district when ordering by UpdatedDate and then Content.
Check out something like this (I don't have your tables, but you will get the idea where to follow with your query):
SELECT DistrictName,
MAX(UpdatedDate),
MAX(LastPublished),
(
SELECT COUNT(*)
FROM PublishedData inr
WHERE inr.DistrictName = outr.DistrictName
AND inr.IsPublished = 0
) AS Unpublished
FROM PublishedData outr
GROUP BY DistrictName
The PublishedData table would need a unique identity column to produce that output, because the latest content cannot be determined from the given schema.
If you want the data apart from the content (DistrictName, UpdatedDate, LastPublished, and the count of unpublished records), use the query below:
select T1.DistrictName, T1.UpdatedDate, T1.LastPublished, T2.Unpublished
from
(select DistrictName, max(UpdatedDate) as UpdatedDate, max(LastPublished) as LastPublished
from PublishedData
group by DistrictName) T1
inner join
(select DistrictName, count(IsPublished) as Unpublished
from PublishedData
where IsPublished = 0
group by DistrictName) T2
on T1.DistrictName = T2.DistrictName
order by T2.Unpublished desc
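One difference between these answers is worth seeing on data: an inner join against a subquery filtered on IsPublished = 0 drops districts that have no unpublished rows, while conditional aggregation keeps them with a count of 0. The sketch below is only an illustration; the column types and sample rows are assumptions, since the original table was shown only as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema; the real table definition is not given in the thread.
conn.execute(
    "CREATE TABLE PublishedData (DistrictName TEXT, Content TEXT, "
    "UpdatedDate TEXT, LastPublished TEXT, IsPublished INTEGER)"
)
conn.executemany(
    "INSERT INTO PublishedData VALUES (?, ?, ?, ?, ?)",
    [
        ("North", "a", "2020-01-01", "2020-01-01", 1),
        ("North", "b", "2020-01-05", None, 0),
        ("North", None, "2020-01-07", None, 0),
        ("South", "c", "2020-02-01", "2020-02-01", 1),  # nothing unpublished
    ],
)

# Conditional aggregation: COUNT ignores the NULLs that CASE yields
# for published rows, so every district gets a row.
conditional = conn.execute(
    """
    SELECT DistrictName, MAX(UpdatedDate), MAX(LastPublished),
           COUNT(CASE WHEN IsPublished = 0 THEN 1 END) AS Unpublished
    FROM PublishedData
    GROUP BY DistrictName
    ORDER BY DistrictName
    """
).fetchall()

# Inner join against a filtered subquery: districts without
# unpublished rows have no match and disappear.
joined = conn.execute(
    """
    SELECT T1.DistrictName, T2.Unpublished
    FROM (SELECT DistrictName FROM PublishedData GROUP BY DistrictName) T1
    INNER JOIN (SELECT DistrictName, COUNT(IsPublished) AS Unpublished
                FROM PublishedData
                WHERE IsPublished = 0
                GROUP BY DistrictName) T2
      ON T1.DistrictName = T2.DistrictName
    ORDER BY T1.DistrictName
    """
).fetchall()
print(conditional, joined)
```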