SQL Query group by - sql

Pasted the sql query that is being used currently to generate the report from unix shell script. I want to have an addition column COUNT (cdw.file_id) writeoffcnt to the below sql query which displays the
count of the the record that matches mg_disp_status=0 and mig_disp_code =3 .
The existing field COUNT (cdw.file_id) cnt should have the count of record that matches
mg_disp_status = 1 and mig_disp_code <> 2. How can i modify the query?
SELECT fs.file_id,
fs.file_id_serv,
fs.file_process_dt,
fs.file_name,
fs.total_records,
RTRIM (d.description_text) source,
SUM(amount),
COUNT (cdw.file_id) cnt
FROM file_status fs,
dr_data_work cdw,
descriptions d,
contacts ec
WHERE file_process_dt >= TO_DATE ('${START_DATE}', 'DD-MON-YYYY')
AND file_process_dt < TO_DATE ('${END_DATE}', 'DD-MON-YYYY')
AND fs.ext_contact_id = ec.ext_contact_id
--
AND ec.description_code = d.description_code
AND cdw.file_id = fs.file_id
AND mg_disp_status = 1
AND mig_disp_code <> 2
GROUP BY fs.file_id,
fs.file_id_serv,
fs.file_process_dt,
fs.file_name,
fs.total_records,
RTRIM (d.description_text);

I don't fully understand all your permutations of requirements, but something like the following should work:
SELECT
.
.
.
SUM(CASE mg_disp_status=0 and mig_disp_code =3 THEN 1 ELSE 0 END) cnt1,
SUM(CASE mg_disp_status=1 and mig_disp_code <> 2 THEN 1 ELSE 0 END) cnt2,

Join twice with different aliases and do your counts:
SELECT fs.file_id,
fs.file_id_serv,
fs.file_process_dt,
fs.file_name,
fs.total_records,
RTRIM (d.description_text) source,
SUM(amount),
COUNT (cdw.file_id) cnt1, COUNT (cdw2.file_id) cnt2
FROM file_status fs,
dr_data_work cdw, dr_data_work cdw2,
descriptions d,
contacts ec
WHERE file_process_dt >= TO_DATE ('${START_DATE}', 'DD-MON-YYYY')
AND file_process_dt < TO_DATE ('${END_DATE}', 'DD-MON-YYYY')
AND fs.ext_contact_id = ec.ext_contact_id
--
AND ec.description_code = d.description_code
AND cdw.file_id = fs.file_id AND cdw2.file_id = fs.file_id
AND cdw.mg_disp_status = 1 AND cdw2.mg_disp_status = 0
AND cdw.mig_disp_code <> 2 AND cdw2.mg_disp_code = 3
GROUP BY fs.file_id,
fs.file_id_serv,
fs.file_process_dt,
fs.file_name,
fs.total_records,
RTRIM (d.description_text);

Related

Combine 2 queries together

I am struggling to work out combining a query that should give me 3 columns of Month, total_sold_products and drinks_sold_products
Query 1:
Select month(date), count(id) as total_sold_products
from Products
where date between '2022-01-01' and '2022-12-31'
Query 2
Select month(date), count(id) as drinks_sold_products
from Products where type = 'drinks' and date between '2022-01-01' and '2022-12-31'
I tried the union function but it summed count(id) twice and gave me only 2 columns
Many thanks!
Union is for attaching sets of data on top of each other. You need conditional aggregation or a join. See below.
SELECT MONTH(date),
COUNT(*) AS total_sold_products,
COUNT(CASE WHEN type = 'drinks' THEN 1 ELSE 0 END) AS drinks_sold_products,
FORMAT((CASE
WHEN COUNT(*) > 0 THEN
COUNT(CASE WHEN type = 'drinks' THEN 1 ELSE 0 END)/COUNT(*)
ELSE 0 END),
'P') AS Percentage
FROM Products
WHERE date BETWEEN'2022-01-01' AND '2022-12-31'
GROUP BY MONTH(date)

Sql out put which come into multiple row converted to single row

This query give multiple row which needs to be shown in single row. Please help.
SELECT blng_serv_code, (COUNT (blng_serv_code)) AS total ,
DECODE (package_trx_yn, 'Y', 'PKG', 'N', 'NPKG') pkg_status FROM bl_patient_charges_folio
WHERE operating_facility_id = 'MC'
AND trx_date >= TO_DATE ('10/10/2019 00:00:00', 'MM/DD/YYYY HH24:MI:SS')AND blng_serv_code = 'LBSB000015'
GROUP BY blng_serv_code, package_trx_yn
If you want the value in a single row, leave out the package status:
SELECT blng_serv_code, COUNT(*) AS total
FROM bl_patient_charges_folio
WHERE operating_facility_id = 'MC' AND
trx_date >= DATE '2019-10-10' AND
blng_serv_code = 'LBSB000015'
GROUP BY blng_serv_code;
If you do want the package status, then you need to explain the logic for including it "on a single row".
EDIT:
It sounds like you want the values in separate columns:
SELECT blng_serv_code, COUNT(*) AS total,
SUM(CASE WHEN package_trx_yn = 'Y' THEN 1 ELSE 0 END) as pkg_cnt,
SUM(CASE WHEN package_trx_yn = 'N' THEN 1 ELSE 0 END) as npkg_cnt
FROM bl_patient_charges_folio
WHERE operating_facility_id = 'MC' AND
trx_date >= DATE '2019-10-10' AND
blng_serv_code = 'LBSB000015'
GROUP BY blng_serv_code;

Why my CASE WHEN gave me an AGGREGATION error message?

I'm trying to make a promo grouping using one promo_code field in a month where there's a chance that a single customer_ID would have more than one transaction and could have two different promo code
SELECT customer_id AS buyer,
CASE
WHEN COUNT(DISTINCT flag_promo) = 2 THEN 'Mixed'
WHEN COUNT(DISTINCT flag_promo) = 1 AND flag_promo = 1 THEN 'Promo'
WHEN COUNT(DISTINCT flag_promo) = 1 AND flag_promo = 0 THEN 'Organic'
END AS promo_group
FROM TABLE
WHERE DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2
It gave me an error message :
SELECT list expression references column flag_promo which is neither grouped nor aggregated at [4:41]
Below is for BigQuery Standard SQL
#standardSQL
SELECT customer_id AS buyer,
CASE
WHEN COUNT(DISTINCT flag_promo) > 1 THEN 'Mixed'
WHEN ANY_VALUE(flag_promo) = 1 THEN 'Promo'
WHEN ANY_VALUE(flag_promo) = 2 THEN 'Organic'
END AS promo_group
FROM `project.dataset.table`
WHERE DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2
This is the query I think you intended to do:
SELECT
customer_id AS buyer,
CASE WHEN COUNT(DISTINCT flag_promo) = 2 THEN 'Mixed'
WHEN COUNT(DISTINCT flag_promo) = 1 AND MIN(flag_promo) = 1 THEN 'Promo'
WHEN COUNT(DISTINCT flag_promo) = 1 AND MIN(flag_promo) = 2 THEN 'Organic'
END AS promo_group
FROM TABLE
WHERE
DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2;
This assumes that a flag_promo value of 1 means Promo and a value of 2 means Organic. If not, then we can easily edit the above query.

How to improve slow sql query with aggregate functions

I want to show top ten customers,sales,margin where customers is registred during this accounting year. The query takes about 65seconds to run and it is not accepted :-(
As you may see i am not good at sql and will be very happy for help to improve the query.
SELECT Top 10
AcTr.R3, Actor.Nm,
SUM(CASE WHEN AcTr.AcNo<='3999' THEN AcAm*-1 ELSE 0 END) AS Sales ,
SUM(AcAm*-1) AS TB
FROM AcTr, Actor
WHERE (Actor.CustNo = AcTr.R3) AND
(Actor.CustNo <> '0') AND
(Actor.CreDt >= '20180901') AND
(Actor.CreDt <= '20190430') AND
AcTr.AcYr = '2018' AND
AcTr.AcPr <= '8' AND
AcTr.AcNo>='3000' AND
AcTr.AcNo <= '4999'
GROUP BY AcTr.R3, Actor.Nm
ORDER BY Sales DESC
Welcome to the community. You have a good start, but future, it is more helpful if you can provide (as commented), the CREATE table declarations so users know the actual data types. Not always required, but helps.
As for your query layout, it is more common to show the JOIN syntax instead of WHERE showing relations between tables, but that comes in time and practice.
Indexes help and should be based on a combination of both WHERE/JOIN criteria AND Grouping fields. Also, if fields are numeric, then do not 'quote' them, just leave as numbers. For example, your AcYr, AcPr, AcNo. I would think that an account number really would be a string value vs number for accounting purposes.
I would suggest the following indexes on your tables
Table Index
Actr ( AcYr, AcPr, AcNo, R3 )
Actor ( CustNo, CreDt )
The Actr table I have the filtering criteria first and the R3 last to help optimize the GROUP BY. The Actor table by the customer number, then the CreDt (Create date??), and is it really a string, or is it a date field? If so, the date criteria would be something like '2018-09-01' and '2019-04-30'
select TOP 10
Actor.Nm,
PreSum.Sales,
PreSm.TB
from
( select
R3,
SUM(CASE WHEN AcTr.AcNo <= '3999'
THEN AcAm * -1 ELSE 0 END) AS Sales,
SUM( AcAm * -1) AS TB
from
Actr
where
AcTr.AcYr = 2018
AND AcTr.AcPr <= 8
AND AcTr.AcNo >= '3000'
AND AcTr.AcNo <= '4999'
GROUP BY
AcTr.R3 ) PreSum
JOIN Actor
on PreSum.R3 = Actor.CustNo
AND Actor.CustNo <> 0
AND Actor.CreDt >= '20180901'
AND Actor.CreDt <= '20190430'
order by
Sales DESC
Per latest inquiry / comment, wanting by year comparison and getting rid of the top 10 performers per a given time period.
select
Actor.Nm,
PreSum.Sales2018,
PreSum.Sales2019,
PreSum.TB2018,
PreSum.TB2019
from
( select
AcTr.R3,
SUM(CASE WHEN AcTr.AcYr = 2018
AND AcTr.AcNo <= '3999'
THEN AcAm * -1 ELSE 0 END) AS Sales2018,
SUM(CASE WHEN AcTr.AcYr = 2019 AND AcTr.AcNo <= '3999'
THEN AcAm * -1 ELSE 0 END) AS Sales2019,
SUM( CASE WHEN AcTr.AcYr = 2018
THEN AcAm * -1 else 0 end ) AS TB2018
SUM( CASE WHEN AcTr.AcYr = 2019
THEN AcAm * -1 else 0 end ) AS TB2019
from
Actr
where
AcTr.AcYr IN ( 2018, 2019 )
AND AcTr.AcPr <= 8
AND AcTr.AcNo >= '3000'
AND AcTr.AcNo <= '4999'
GROUP BY
AcTr.R3 ) PreSum
JOIN Actor
on PreSum.R3 = Actor.CustNo
AND Actor.CustNo <> 0
AND Actor.CreDt >= '20180901'
AND Actor.CreDt <= '20190430'
order by
Sales DESC

counting events over flexible ranges

I am trying to count events (which are rows in the event_table) in the year before and the year after a particular target date for each person. For example, say I have a person 100 and target date is 10/01/2012. I would like to count events in 9/30/2011-9/30/2012 and in 10/02/2012-9/30/2013.
My query looks like:
select *
from (
select id, target_date
from subsample_table
) as i
left join (
select id, event_date, count(*) as N
, case when event_date between target_date-365 and target_date-1 then 0
when event_date between target_date+1 and target_date+365 then 1
else 2 end as after
from event_table
group by id, target_date, period
) as h
on i.id = h.id
and i.target_date = h.event_date
The output should look something like:
id target_date after N
100 10/01/2012 0 1000
100 10/01/2012 1 0
It's possible that some people do not have any events in the before or after periods (or both), and it would be nice to have zeros in that case. I don't care about the events outside the 730 days.
Any suggestions would be greatly appreciated.
I think the following may approach what you are trying to accomplish.
select id
, target_date
, event_date
, count(*) as N
, SUM(case when event_date between target_date-365 and target_date-1
then 1
else 0
end) AS Prior_
, SUM(case when event_date between target_date+1 and target_date+365
then 1
else 0
end) as After_
from subsample_table i
left join
event_table h
on i.id = h.id
and i.target_date = h.event_date
group by id, target_date, period
This is a generic answer. I don't know what date functions teradata has, so I will use sql server syntax.
select id, target_date, sum(before) before, sum(after) after, sum(righton) righton
from yourtable t
join (
select id, target_date td
, case when yourdate >= dateadd(year, -1, target_date)
and yourdate < target_date then 1 else 0 end before
, case when yourdate <= dateadd(year, 1, target_date)
and yourdate > target_date then 1 else 0 end after
, case when yourdate = target_date then 1 else 0 end righton
from yourtable
where whatever
group by id, target_date) sq on t.id = sq.id and target_date = dt
where whatever
group by id, target_date
This answer assumes that an id can have more than one target date.