SQL query to find Date, TransactionAmount_US, TransactionAmount_UK - sql

We have 2 tables:
Transaction (AccountId,Date,TransactionAmount)
Master(Aid,Country)
We need to find Date, TotalTAmt_US, TotalTAmt_UK.
My solution:
Select
Date,
CASE WHEN Country in ('US') THEN SUM(TransAmt) ELSE '0'END AS TotalTAmt_US,
CASE WHEN Country in ('UK') THEN SUM(TransAmt) ELSE '0'END AS TotalTAmt_UK
FROM
(
SELECT
T.Date As Date,
M.Country As Country,
SUM(T.TransAmt) As TransAmt,
FROM
Transaction T JOIN Master M On T.Aid = M.Aid
WHERE Country in ('US','UK')
group by Date,Country
) As T1
group by Date;
Is this right?
Can we use Country in the CASE WHEN without selecting it in the subquery? I do not want to pull it and then group by it.
Please advise.
Thanks.

You have to list in the GROUP BY clause every non-aggregated column or expression that you use in the SELECT list.
So just add your CASE expressions to the grouping:
Select
Date,
CASE WHEN Country in ('US') THEN TransAmt ELSE 0 END AS TotalTAmt_US,
CASE WHEN Country in ('UK') THEN TransAmt ELSE 0 END AS TotalTAmt_UK
FROM
(
SELECT
T.Date As Date,
M.Country As Country,
SUM(T.TransAmt) As TransAmt
FROM
Transaction T JOIN Master M On T.Aid = M.Aid
WHERE Country in ('US','UK')
GROUP BY Date,Country
) As T1
GROUP BY Date, CASE WHEN Country in ('US') THEN TransAmt ELSE 0 END,
CASE WHEN Country in ('UK') THEN TransAmt ELSE 0 END
Additionally, remove the SUM() function from your CASE expressions. If you put aggregate functions in the GROUP BY you can get this error:
ORA-00934: group function is not allowed here
Finally, remove the quotes around the zeros in the ELSE branches. Otherwise you can get another error about inconsistent data types.

I believe you wanted to do a conditional aggregation, getting the sum of all US-based transactions for a day and the sum of all UK-based transactions for the same day. To do that, you have to move the CASE into the sum(), adding the amount only if the country is the one you are looking for and zero otherwise.
Also your subquery isn't necessary.
SELECT t.date,
sum(CASE m.country
WHEN 'US' THEN
t.transamt
ELSE
0
END) totaltamt_us,
sum(CASE m.country
WHEN 'UK' THEN
t.transamt
ELSE
0
END) totaltamt_uk
FROM transaction t
INNER JOIN master m
ON m.aid = t.accountid
WHERE m.country IN ('US',
'UK')
GROUP BY t.date;
If you don't insist on having the different sums in different columns, but would also accept rows, it can be as simple as:
SELECT m.country,
t.date,
sum(t.transamt) totaltamt
FROM transaction t
INNER JOIN master m
ON m.aid = t.accountid
WHERE m.country IN ('US',
'UK')
GROUP BY m.country,
t.date;

I think you want conditional aggregation:
SELECT Date,
SUM(CASE WHEN Country in ('US') THEN TransAmt ELSE 0 END) AS TotalTAmt_US,
SUM(CASE WHEN Country in ('UK') THEN TransAmt ELSE 0 END) AS TotalTAmt_UK
FROM Transaction T JOIN
Master M
On T.Aid = M.Aid
WHERE Country in ('US', 'UK')
GROUP BY Date;
Notes:
You do not need two levels of aggregation.
Numbers should not be enclosed in single quotes, so 0, not '0'.
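For anyone who wants to try the conditional aggregation above, here is a minimal sketch that runs it against an in-memory SQLite database from Python. It is only an illustration under a few assumptions: the table is called Txn because TRANSACTION is a reserved word in SQLite, the column names follow the schema in the question, and the sample rows are invented.

```python
import sqlite3

# In-memory database with invented sample data (assumption: not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Master (Aid INTEGER PRIMARY KEY, Country TEXT);
    CREATE TABLE Txn (AccountId INTEGER, Date TEXT, TransactionAmount REAL);
    INSERT INTO Master VALUES (1, 'US'), (2, 'UK'), (3, 'FR');
    INSERT INTO Txn VALUES
        (1, '2021-01-01', 100), (1, '2021-01-01', 50),
        (2, '2021-01-01', 30),  (3, '2021-01-01', 999),
        (2, '2021-01-02', 70);
""")

# One pass, one GROUP BY: the CASE sits inside SUM(), so Country is
# neither selected nor grouped by.
rows = conn.execute("""
    SELECT t.Date,
           SUM(CASE WHEN m.Country = 'US' THEN t.TransactionAmount ELSE 0 END) AS TotalTAmt_US,
           SUM(CASE WHEN m.Country = 'UK' THEN t.TransactionAmount ELSE 0 END) AS TotalTAmt_UK
    FROM Txn t
    JOIN Master m ON m.Aid = t.AccountId
    WHERE m.Country IN ('US', 'UK')
    GROUP BY t.Date
    ORDER BY t.Date
""").fetchall()

for row in rows:
    print(row)
```

Each date comes back as a single row with one column per country, and a date with no US transactions gets 0 in the US column rather than disappearing.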

Related

When else with partition by isn't working in redshift queries

I would like to exclude the tag values sub_tag1, sub_tag2 and sub_tag3 from TAG_SALES_by_month, while every other tag listed in the WHERE condition should still be included in the count. I couldn't achieve the desired result. Can anyone help me with this? It would be very much appreciated.
select o.tag,
o.SOME, o.THING, o.ILIKE, o.date, c.THE, c.MOST,
date_part(month, o.date) as Month,
date_part(day, o.date) as day,
count(o.id) over (partition by day, CUST_Id) as SALE_NO,
count(o.id) over (partition by Month, CUST_Id) as SALE_NO_by_month,
count(case when (tag <> 'sub_tag1' AND tag <> 'sub_tag2' AND tag <> 'sub_tag3') then o.id else 0 END) over (partition by Month, CUST_Id) as TAG_SALES_by_month,
c.id as CUST_Id
from order_info o
left join config c on o.SOME = c.SOME
where date >= '05/01/2021' AND tag in ('sub_tag1', 'sub_tag2', 'sub_tag3', 'sub_tag4', 'sub_tag5',
'sub_tag6') AND ILIKE = 'JACK'
group by o.tag, o.SOME, o.THING, o.ILIKE, o.date, c.THE, c.MOST, CUST_Id, o.id
order by date
Per the comments, the issue here is that COUNT counts every non-NULL value: it checks whether a value exists, not what the value is.
So COUNT(CASE WHEN ... ELSE 0 END) still counts the rows that hit the ELSE branch, since 0 is a value that exists.
The solution is to use ELSE NULL, or to omit the ELSE clause (which defaults to NULL), because NULL is not counted.
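A small sketch of the difference, using SQLite from Python rather than Redshift (an assumption made only so the example is self-contained; the COUNT semantics shown are standard SQL). The four sample rows are invented, and a plain aggregate is used instead of a window function to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_info (id INTEGER, tag TEXT);
    INSERT INTO order_info VALUES
        (1, 'sub_tag1'), (2, 'sub_tag2'), (3, 'sub_tag4'), (4, 'sub_tag5');
""")

# Three ways of "counting" rows whose tag is not one of the excluded values.
with_else_zero, with_null, with_sum = conn.execute("""
    SELECT
        -- ELSE 0 is a non-NULL value, so every row is counted
        COUNT(CASE WHEN tag NOT IN ('sub_tag1', 'sub_tag2', 'sub_tag3') THEN id ELSE 0 END),
        -- no ELSE means NULL, which COUNT skips
        COUNT(CASE WHEN tag NOT IN ('sub_tag1', 'sub_tag2', 'sub_tag3') THEN id END),
        -- SUM over 1/0 is an equivalent alternative
        SUM(CASE WHEN tag NOT IN ('sub_tag1', 'sub_tag2', 'sub_tag3') THEN 1 ELSE 0 END)
    FROM order_info
""").fetchone()

print(with_else_zero, with_null, with_sum)
```

The first expression counts all four rows, while the other two count only the two rows with non-excluded tags.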

Can't use column alias in GROUP BY

I can run this in mysql with no problem
SELECT
DATE_FORMAT(trans_date, '%Y-%m') month,
COUNTRY, COALESCE(COUNT(*), 0) trans_count,
COALESCE(SUM(CASE WHEN state ='approved' THEN 1 END), 0) approved_count,
COALESCE(SUM(amount), 0) trans_total_amount,
COALESCE(SUM(CASE WHEN state ='approved' THEN amount END), 0) approved_total_amount
FROM
Transactions
GROUP BY
month, COUNTRY
ORDER BY
month;
but the same query doesn't run in Oracle. I can't GROUP BY the column alias, and I can't aggregate without using GROUP BY.
I can nest subquery over subquery or use a CTE, but it is just so tedious.
What is a good query for this type of issue?
As mentioned in another answer, you cannot use aliases in GROUP BY, but you can use them in ORDER BY. Also, DATE_FORMAT is a MySQL function. The Oracle equivalent is TO_CHAR.
So your final query should be as follows:
SELECT
TO_CHAR(TRANS_DATE, 'YYYY-MM') AS MONTH,
COUNTRY,
COUNT(*) AS TRANS_COUNT,
SUM(CASE WHEN STATE = 'approved' THEN 1 ELSE 0 END) AS APPROVED_COUNT,
SUM(AMOUNT) AS TRANS_TOTAL_AMOUNT,
SUM(CASE WHEN STATE = 'approved' THEN AMOUNT ELSE 0 END) AS APPROVED_TOTAL_AMOUNT
FROM TRANSACTIONS
GROUP BY TO_CHAR(TRANS_DATE, 'YYYY-MM'), COUNTRY
ORDER BY MONTH;
Oracle doesn't support column aliases in the GROUP BY. Also, the COALESCE() is unnecessary in this case:
SELECT TO_CHAR(trans_date, 'YYYY-MM') as month, COUNTRY,
COUNT(*) as trans_count,
SUM(CASE WHEN state ='approved' THEN 1 ELSE 0 END) as approved_count,
SUM(amount) as trans_total_amount,
SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) as approved_total_amount
FROM Transactions
GROUP BY TO_CHAR(trans_date, 'YYYY-MM'), COUNTRY
ORDER BY month;
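If repeating the expression in GROUP BY feels error-prone, a derived table that computes the month once is a portable alternative. The sketch below compares both forms in SQLite from Python. Note the assumptions: SQLite's strftime stands in for Oracle's TO_CHAR, SQLite itself would also accept the alias in GROUP BY, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Transactions (id INTEGER, country TEXT, state TEXT,
                               amount INTEGER, trans_date TEXT);
    INSERT INTO Transactions VALUES
        (1, 'US', 'approved', 1000, '2018-12-18'),
        (2, 'US', 'declined', 2000, '2018-12-19'),
        (3, 'US', 'approved', 2000, '2019-01-01'),
        (4, 'DE', 'approved', 2000, '2019-01-07');
""")

# The aggregate columns are identical in both variants.
AGGREGATES = """
    COUNT(*) AS trans_count,
    SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
    SUM(amount) AS trans_total_amount,
    SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
"""

# Variant 1: repeat the month expression in GROUP BY.
by_expression = conn.execute(f"""
    SELECT strftime('%Y-%m', trans_date) AS month, country, {AGGREGATES}
    FROM Transactions
    GROUP BY strftime('%Y-%m', trans_date), country
    ORDER BY month, country
""").fetchall()

# Variant 2: compute the month once in a derived table, then group by its name.
by_derived_table = conn.execute(f"""
    SELECT month, country, {AGGREGATES}
    FROM (SELECT strftime('%Y-%m', trans_date) AS month, country, state, amount
          FROM Transactions) t
    GROUP BY month, country
    ORDER BY month, country
""").fetchall()

print(by_expression == by_derived_table)
for row in by_expression:
    print(row)
```

In the derived-table form, month is a real column of the inner query by the time GROUP BY sees it, which is why the restriction on SELECT-list aliases no longer applies.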

Division between data in rows - SQL

The data in my table looks like this:
date, app, country, sales
2017-01-01,XYZ,US,10000
2017-01-01,XYZ,GB,2000
2017-01-02,XYZ,US,30000
2017-01-02,XYZ,GB,1000
I need to find, for each app on a daily basis, the ratio of US sales to GB sales, so ideally the result would look like this:
date, app, ratio
2017-01-01,XYZ,10000/2000 = 5
2017-01-02,XYZ,30000/1000 = 30
I'm currently dumping everything into a csv and doing my calculations offline in Python but I wanted to move everything onto the SQL side. One option would be to aggregate each country into a subquery, join and then divide, such as
select d1_us.date, d1_us.app, d1_us.sales / d1_gb.sales from
(select date, app, sales from table where date between '2017-01-01' and '2017-01-10' and country = 'US') as d1_us
join
(select date, app, sales from table where date between '2017-01-01' and '2017-01-10' and country = 'GB') as d1_gb
on d1_us.app = d1_gb.app and d1_us.date = d1_gb.date
Is there a less messy way to go about doing this?
You can use the ratio of SUM(CASE WHEN) and GROUP BY in your query to do this without requiring a subquery.
SELECT DATE,
APP,
SUM(CASE WHEN COUNTRY = 'US' THEN SALES ELSE 0 END) /
SUM(CASE WHEN COUNTRY = 'GB' THEN SALES END) AS RATIO
FROM TABLE1
GROUP BY DATE, APP;
Depending on how likely it is for the GB sales to be zero, you can tweak the GB ELSE branch (maybe ELSE 1) to avoid a divide-by-zero error. It really depends on how you want to handle such exceptions.
You can use one query with grouping and provide the condition once:
SELECT date, app,
SUM(CASE WHEN country = 'US' THEN SALES ELSE 0 END) /
SUM(CASE WHEN country = 'GB' THEN SALES END) AS ratio
FROM your_table
WHERE date BETWEEN '2017-01-01' AND '2017-01-10'
GROUP BY date, app;
However, this gives you zero if there are no records for US and NULL if there are no records for GB. If you need to return different values for those cases, you can use another CASE WHEN surrounding the division. For example, to return -1 and -2 respectively, you can use:
SELECT date, app,
CASE WHEN COUNT(CASE WHEN country = 'US' THEN 1 END) = 0 THEN -1
WHEN COUNT(CASE WHEN country = 'GB' THEN 1 END) = 0 THEN -2
ELSE SUM(CASE WHEN country = 'US' THEN SALES ELSE 0 END) /
SUM(CASE WHEN country = 'GB' THEN SALES END)
END AS ratio
FROM your_table
WHERE date BETWEEN '2017-01-01' AND '2017-01-10'
GROUP BY date, app;
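Another way to guard the division, sketched here with SQLite from Python, is to wrap the denominator in NULLIF so that a zero GB total yields NULL instead of an error. This is a swapped-in variation on the queries above, not what either answer proposed. The table name sales_data and the extra 2017-01-03 row (US only, no GB) are assumptions added to exercise the edge case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_data (date TEXT, app TEXT, country TEXT, sales REAL);
    INSERT INTO sales_data VALUES
        ('2017-01-01', 'XYZ', 'US', 10000),
        ('2017-01-01', 'XYZ', 'GB', 2000),
        ('2017-01-02', 'XYZ', 'US', 30000),
        ('2017-01-02', 'XYZ', 'GB', 1000),
        ('2017-01-03', 'XYZ', 'US', 500);   -- invented row: no GB sales that day
""")

rows = conn.execute("""
    SELECT date, app,
           -- * 1.0 forces floating-point division even if sales were stored as integers
           SUM(CASE WHEN country = 'US' THEN sales ELSE 0 END) * 1.0 /
           -- NULLIF turns a zero denominator into NULL, so the ratio becomes NULL
           NULLIF(SUM(CASE WHEN country = 'GB' THEN sales ELSE 0 END), 0) AS ratio
    FROM sales_data
    WHERE date BETWEEN '2017-01-01' AND '2017-01-10'
    GROUP BY date, app
    ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```

The first two dates give ratios of 5 and 30 as in the question, and the date without GB sales comes back with a NULL ratio (None in Python), which the caller can then map to whatever sentinel it prefers.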
DROP TABLE IF EXISTS t;
CREATE TABLE t (
date DATE,
app VARCHAR(5),
country VARCHAR(5),
sales DECIMAL(10,2)
);
INSERT INTO t VALUES
('2017-01-01','XYZ','US',10000),
('2017-01-01','XYZ','GB',2000),
('2017-01-02','XYZ','US',30000),
('2017-01-02','XYZ','GB',1000);
WITH q AS (
SELECT
date,
app,
country,
SUM(sales) AS sales
FROM t
GROUP BY date, app, country
) SELECT
q1.date,
q1.app,
q1.country || ' vs ' || NVL(q2.country,'-') AS ratio_between,
CASE WHEN q2.sales IS NULL OR q2.sales = 0 THEN 0 ELSE ROUND(q1.sales / q2.sales, 2) END AS ratio
FROM q AS q1
LEFT JOIN q AS q2 ON q2.date = q1.date AND
q2.app = q1.app AND
q2.country != q1.country
-- WHERE q1.country = 'US'
ORDER BY q1.date;
Results for any country vs any country (WHERE q1.country='US' is commented out)
date,app,ratio_between,ratio
2017-01-01,XYZ,GB vs US,0.20
2017-01-01,XYZ,US vs GB,5.00
2017-01-02,XYZ,GB vs US,0.03
2017-01-02,XYZ,US vs GB,30.00
Results for US vs any other country (WHERE q1.country='US' uncommented)
date,app,ratio_between,ratio
2017-01-01,XYZ,US vs GB,5.00
2017-01-02,XYZ,US vs GB,30.00
The trick is in the JOIN clause.
The subquery q, which aggregates the data by date, app and country, is joined with itself on date and app.
This way, for every date, app and country we get a match with every country on the same date and app. By adding q1.country != q2.country, we exclude the rows that pair a country with itself, highlighted below with *
date,app,country,sales,date,app,country,sales
*2017-01-01,XYZ,GB,2000.00,2017-01-01,XYZ,GB,2000.00*
2017-01-01,XYZ,GB,2000.00,2017-01-01,XYZ,US,10000.00
2017-01-01,XYZ,US,10000.00,2017-01-01,XYZ,GB,2000.00
*2017-01-01,XYZ,US,10000.00,2017-01-01,XYZ,US,10000.00*
2017-01-02,XYZ,GB,1000.00,2017-01-02,XYZ,US,30000.00
*2017-01-02,XYZ,GB,1000.00,2017-01-02,XYZ,GB,1000.00*
*2017-01-02,XYZ,US,30000.00,2017-01-02,XYZ,US,30000.00*
2017-01-02,XYZ,US,30000.00,2017-01-02,XYZ,GB,1000.00

Group by T-SQL vs. MySQL (single column)

I am new to SQL Server/used to MySQL databases and I am running into an issue that I never ran into with MySQL. I am looking to pull all current policy numbers, the name of the company/person it belongs to, their total premium, and whether or not they have what we call 'equipment breakdown' coverage. This is all pretty simple, the issue I am having is with grouping. I want to group by one column only, aka one distinct policy number, the company name, a sum of the premium (it is possible to have several premium amounts both negative and positive so I want to sum these to see what the true total is), and a simple Yes or No column for equipment breakdown.
Here is the query I am running:
SELECT pol_num as policy_number,
insd_name as insureds_name,
SUM(amt) as 'total_premium',
(SELECT
CASE
WHEN cvg_desc = 'Equipment Breakdown'
THEN 'Y'
ELSE 'N'
END) as 'equipment_breakdown'
FROM bapu.dbo.fact_prem
WHERE '2014-05-06' between d_pol_eff and d_pol_exp
AND amt_type = 'Premium'
AND amt_desc = 'Written Premium'
GROUP BY pol_num
ORDER BY policy_number
I get an error saying that I need to group by insd_name and cvg_desc as well, but I DON'T want that as it gives me duplicate policy numbers.
Here is an example of what I get when I group everything it tells me to:
policy_number insureds_name total_premium equipment_breakdown
001 company a 0.00 n
001 company a 25,000.00 n
001 company a -10,000.00 n
002 company b 100.00 y
002 company b 10,000.00 y
Here is an example of the results I want:
policy_number insureds_name total_premium equipment_breakdown
001 company a 15,000.00 n
002 company b 10,100.00 y
Basically, I just want to group by the policy number and sum the premium amounts. Above is how I would achieve this in MySQL, how can I achieve the results I am looking for in SQL Server?
Thanks
MySQL doesn't require all non-aggregated fields to be included in the GROUP BY clause, even though omitting them can yield unexpected results. SQL Server does require it, so you are forced to decide how to handle multiple insd_name values for a given pol_num. You can use MAX() or MIN(), or, if the values are always the same, just add them to your GROUP BY:
SELECT pol_num AS policy_number
, MAX(insd_name) AS insureds_name
, SUM(amt) AS 'total_premium'
, MAX(CASE WHEN cvg_desc = 'Equipment Breakdown' THEN 'Y'
ELSE 'N'
END) AS 'equipment_breakdown'
FROM bapu.dbo.fact_prem
WHERE '2014-05-06' BETWEEN d_pol_eff AND d_pol_exp
AND amt_type = 'Premium'
AND amt_desc = 'Written Premium'
GROUP BY pol_num
ORDER BY policy_number
Or:
SELECT pol_num AS policy_number
, insd_name AS insureds_name
, SUM(amt) AS 'total_premium'
, CASE WHEN cvg_desc = 'Equipment Breakdown' THEN 'Y'
ELSE 'N'
END AS 'equipment_breakdown'
FROM bapu.dbo.fact_prem
WHERE '2014-05-06' BETWEEN d_pol_eff AND d_pol_exp
AND amt_type = 'Premium'
AND amt_desc = 'Written Premium'
GROUP BY pol_num
, insd_name
, CASE WHEN cvg_desc = 'Equipment Breakdown' THEN 'Y'
ELSE 'N'
END
ORDER BY policy_number
It looks like the cvg_desc column is probably what's messing you up. You want to group by the resulting Y or N from your CASE statement, but SQL server is grouping by the original cvg_desc column. You could approach this in a way that resolves the CASE statement before it groups. For example, wrap the main query in a common table expression (CTE), which is sort of like an inline-view. Then with the equipment breakdown column reduced to just a Y or an N, a subsequent query from the CTE with your SUM aggregation on premium should give you the results you desire:
WITH Policies(policy_number, insureds_name, premium, equipment_breakdown) AS
(
SELECT
pol_num
,insd_name
,amt
,(CASE WHEN cvg_desc = 'Equipment Breakdown' THEN 'Y' ELSE 'N' END)
AS 'equipment_breakdown'
FROM
bapu.dbo.fact_prem
WHERE
'2014-05-06' BETWEEN d_pol_eff AND d_pol_exp
AND
amt_type = 'Premium'
AND
amt_desc = 'Written Premium'
)
SELECT
policy_number
,insureds_name
,SUM(premium) AS total_premium
,equipment_breakdown
FROM
Policies
GROUP BY
policy_number
,insureds_name
,equipment_breakdown
You'll need an aggregate function on the fields you don't want to group by. A simple one to use is MAX, which works with most types:
SELECT pol_num as policy_number,
MAX(insd_name) as insureds_name,
SUM(amt) as 'total_premium',
MAX(CASE
WHEN cvg_desc = 'Equipment Breakdown'
THEN 'Y'
ELSE 'N'
END) as 'equipment_breakdown'
FROM bapu.dbo.fact_prem
WHERE '2014-05-06' between d_pol_eff and d_pol_exp
AND amt_type = 'Premium'
AND amt_desc = 'Written Premium'
GROUP BY pol_num
ORDER BY policy_number
The reason SQL Server wants this is that it likes to give deterministic answers, for example
column_a | column_b
1 | 1
1 | 2
...grouped by only column_a would in MySQL give either 1 or 2 as an answer for column_b, while SQL Server wants you to tell it explicitly which one to use.
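To make the "tell it explicitly which one to use" point concrete, here is a minimal sketch in SQLite from Python that groups by pol_num only and aggregates every other column. The rows mirror the amounts in the question, but the cvg_desc values other than 'Equipment Breakdown' are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_prem (pol_num TEXT, insd_name TEXT, amt REAL, cvg_desc TEXT);
    INSERT INTO fact_prem VALUES
        ('001', 'company a',      0, 'Liability'),
        ('001', 'company a',  25000, 'Liability'),
        ('001', 'company a', -10000, 'Liability'),
        ('002', 'company b',    100, 'Equipment Breakdown'),
        ('002', 'company b',  10000, 'Property');
""")

rows = conn.execute("""
    SELECT pol_num AS policy_number,
           MAX(insd_name) AS insureds_name,   -- same value on every row of a policy
           SUM(amt) AS total_premium,
           -- 'Y' sorts after 'N', so MAX yields 'Y' if any row has the coverage
           MAX(CASE WHEN cvg_desc = 'Equipment Breakdown' THEN 'Y' ELSE 'N' END)
               AS equipment_breakdown
    FROM fact_prem
    GROUP BY pol_num
    ORDER BY policy_number
""").fetchall()

for row in rows:
    print(row)
```

Putting the CASE inside MAX means a policy is flagged 'Y' as soon as one of its rows carries the coverage, regardless of what the other rows say.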
I would probably write this as below (not tested):
SELECT pol_num as policy_number,
insd_name as insureds_name,
SUM(amt) as total_premium,
CASE
WHEN cvg_desc = 'Equipment Breakdown'
THEN 'Y'
ELSE 'N'
END as equipment_breakdown
FROM bapu.dbo.fact_prem
WHERE '2014-05-06' between d_pol_eff and d_pol_exp
AND amt_type = 'Premium'
AND amt_desc = 'Written Premium'
GROUP BY
pol_num, insd_name,
CASE
WHEN cvg_desc = 'Equipment Breakdown'
THEN 'Y'
ELSE 'N'
END
ORDER BY policy_number

Using case to create multiple columns of data

I am trying to create a query in MS SQL 2005 that will return data for 4 date ranges as separate columns in my results set.
Right now my query looks like the one below. It works fine, but it currently supports only one date range, and I want to add a column for each additional date range.
The query would then return total1, total2, total3 and total4 columns instead of the single total column of the current query. Each total would correspond to one of the 4 date ranges.
I am fairly sure this can be accomplished using CASE statements, but I am not 100% sure.
Any help would be certainly appreciated.
SELECT
vendor,location,
sum(ExtPrice) as total
FROM [database].[dbo].[saledata]
where processdate between '2010-11-03' and '2010-12-14'
and location <>''
and vendor <> ''
group by vendor,location with rollup
I usually do it like this:
SELECT
vendor,location,
sum(CASE WHEN processdate BETWEEN @date1start AND @date1end THEN ExtPrice ELSE 0 END) as total,
sum(CASE WHEN processdate BETWEEN @date2start AND @date2end THEN ExtPrice ELSE 0 END) as total2,
sum(CASE WHEN processdate BETWEEN @date3start AND @date3end THEN ExtPrice ELSE 0 END) as total3,
sum(CASE WHEN processdate BETWEEN @date4start AND @date4end THEN ExtPrice ELSE 0 END) as total4
FROM [database].[dbo].[saledata]
where location <>''
and vendor <> ''
group by vendor,location with rollup
And you can change the WHEN portion to make your desired date ranges.
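Here is a rough sketch of the same pattern, run in SQLite from Python with the date ranges passed as named parameters. It uses only two ranges to stay short, leaves out WITH ROLLUP (which SQLite does not support), and the sample rows and the second date range are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE saledata (vendor TEXT, location TEXT, ExtPrice REAL, processdate TEXT);
    INSERT INTO saledata VALUES
        ('acme', 'east', 10, '2010-11-05'),
        ('acme', 'east', 20, '2010-12-20'),
        ('acme', 'west',  5, '2010-11-10'),
        ('bolt', 'east',  7, '2010-12-25'),
        ('',     'east', 99, '2010-11-05');   -- filtered out: empty vendor
""")

# Hypothetical date ranges; each one feeds its own SUM(CASE ...) column.
params = {
    "s1": "2010-11-03", "e1": "2010-12-14",
    "s2": "2010-12-15", "e2": "2010-12-31",
}

rows = conn.execute("""
    SELECT vendor, location,
           SUM(CASE WHEN processdate BETWEEN :s1 AND :e1 THEN ExtPrice ELSE 0 END) AS total1,
           SUM(CASE WHEN processdate BETWEEN :s2 AND :e2 THEN ExtPrice ELSE 0 END) AS total2
    FROM saledata
    WHERE location <> '' AND vendor <> ''
    GROUP BY vendor, location
    ORDER BY vendor, location
""", params).fetchall()

for row in rows:
    print(row)
```

Because the date filter lives inside each CASE rather than in the WHERE clause, one scan of the table fills all the range columns, and a vendor/location pair with no sales in a given range simply gets 0 there.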
Use subqueries, i.e.:
select sd.vendor, sd.location, sd1.total as total1, sd2.total as total2, sd3.total as total3, sd4.total as total4
from (select distinct vendor, location from saledata) AS sd
LEFT JOIN (
SELECT vendor,location, sum(ExtPrice) as total
FROM [database].[dbo].[saledata]
where processdate between 'startdate1' and 'enddate1'
and location <>''
and vendor <> ''
group by vendor,location with rollup) sd1 on sd1.vendor=sd.vendor and sd1.location=sd.location
LEFT JOIN (
SELECT vendor,location, sum(ExtPrice) as total
FROM [database].[dbo].[saledata]
where processdate between 'startdate2' and 'enddate2'
and location <>''
and vendor <> ''
group by vendor,location with rollup) sd2 on sd2.vendor=sd.vendor and sd2.location=sd.location
LEFT JOIN (
SELECT vendor,location, sum(ExtPrice) as total
FROM [database].[dbo].[saledata]
where processdate between 'startdate3' and 'enddate3'
and location <>''
and vendor <> ''
group by vendor,location with rollup) sd3 on sd3.vendor=sd.vendor and sd3.location=sd.location
LEFT JOIN (
SELECT vendor,location, sum(ExtPrice) as total
FROM [database].[dbo].[saledata]
where processdate between 'startdate4' and 'enddate4'
and location <>''
and vendor <> ''
group by vendor,location with rollup) sd4 on sd4.vendor=sd.vendor and sd4.location=sd.location