SQL query to use group by to get the sum of two different columns within a date range

I have two tables for an employee: time track and absence.
person_number  Measure  start_Date   end_date     Time_type
73636          10       01-Jan-2020  02-Jan-2020  Double
73636          24       06-Jan-2020  08-Jan-2020  Double
73636          10       15-Jan-2020  25-Jan-2020  Regular Pay
73636          11.9     06-Jan-2020  08-Jan-2020  Double
73636          27       10-Jan-2020  15-Jan-2020  Regular Pay
Absence det
person_number  start_Date   end_date     duration  Absence_type
73636          05-Jan-2020  10-Jan-2020  10        Vacation
73636          06-Jan-2020  18-Jan-2020  9         Paid Leave
73636          20-Jan-2020  21-Jan-2020  1         Paid Leave
When I pass 01-Jan-2020 and 31-Jan-2020 as the from and to dates, the output should look like this:
Person_Number  Double  Regular  Hour_code   hour_amount
73636          31.9    37       Paid Leave  10
The hour_code should contain only "Paid Leave" and no other absences.
I have written the query below for this:
SELECT
distinct person_number,
sum(
CASE
WHEN elements = 'Double' THEN measure
END
) AS OT_Hours,
sum(
CASE
WHEN elements LIKE 'Regular Pay%' THEN measure
END
) AS regular_measure_hours,
sum(
CASE
WHEN absence_name IN ('Paid Leave') THEN absence_duration
END
) AS hour3_amount,
max(
CASE
WHEN absence_name IN ('Paid Leave') THEN 'Paid Leave'
END
) AS hour3_code
FROM
(
select
person_number,
Time_type elements,
Absence_type absence_name,
duration,
measure
from
time_track_tab,
abs_tab,
per_all_people_F papf
where
time_track_tab.person_id = abs_tab.person_id
and abs_tab.person_id = papf.person_id
and abs_tab.Absence_type = 'Paid Leave'
)
group by
person_number
This returns multiple rows and the sums are wrong, because between the from and to dates both the absence and the time track tables contain several rows with different dates.
My requirement is to sum ALL the duration and measure values that fall within the parameter dates. How can I tweak my query to get the correct sums between these dates?
Is there a way to use PARTITION BY, GROUP BY, or anything else to calculate these columns correctly?

You probably need to group both tables first then join them together to avoid the cross join.
select person_number, TimeTrack.DoublePay, TimeTrack.Regular,
Absenses.Hour_code, Absenses.hour_amount from
per_all_people_F papf,
(select
person_id, sum(duration) as hour_amount, Absence_type as Hour_code
from
abs_tab
where
abs_tab.Absence_type = 'Paid Leave'
and
start_Date between '2020-01-01' and '2020-01-31'
group by person_id,Absence_type
) Absenses,
(select
person_id,
sum(case when Time_type = 'Double' then Measure end) as DoublePay,
sum(case when Time_type = 'Regular Pay' then Measure end) as Regular
from time_track_tab
where
start_Date between '2020-01-01' and '2020-01-31'
group by person_id
) TimeTrack
where
papf.person_id = TimeTrack.person_id
and
papf.person_id = Absenses.person_id
and
papf.person_id = 73636
I made a SQL Fiddle if you want to play with it:
http://sqlfiddle.com/#!9/03e460/36
Also, my 2 cents: I'd recommend a left outer join from the per_all_people_F table, or else people without absences will get filtered out.
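To make the row multiplication concrete, here is a minimal sketch that runs both shapes of the query against an in-memory SQLite database from Python. SQLite, the simplified column list and the sample rows are all assumptions for illustration; they are not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_track_tab (person_id INTEGER, measure REAL, start_date TEXT, time_type TEXT);
CREATE TABLE abs_tab (person_id INTEGER, duration REAL, start_date TEXT, absence_type TEXT);
-- illustrative rows only
INSERT INTO time_track_tab VALUES
  (1, 10, '2020-01-01', 'Double'),
  (1, 24, '2020-01-06', 'Double'),
  (1, 27, '2020-01-10', 'Regular Pay');
INSERT INTO abs_tab VALUES
  (1, 9, '2020-01-06', 'Paid Leave'),
  (1, 1, '2020-01-20', 'Paid Leave');
""")

# Joining the raw tables pairs every time row with every absence row
# (3 x 2 = 6 rows), so each value is summed more than once.
naive = conn.execute("""
SELECT SUM(CASE WHEN t.time_type = 'Double' THEN t.measure END),
       SUM(a.duration)
FROM time_track_tab t
JOIN abs_tab a ON a.person_id = t.person_id
GROUP BY t.person_id
""").fetchone()

# Aggregating each table down to one row per person first keeps the join 1:1.
grouped = conn.execute("""
SELECT tt.double_pay, tt.regular, ab.hour_amount
FROM (SELECT person_id,
             SUM(CASE WHEN time_type = 'Double' THEN measure END) AS double_pay,
             SUM(CASE WHEN time_type = 'Regular Pay' THEN measure END) AS regular
      FROM time_track_tab
      WHERE start_date BETWEEN '2020-01-01' AND '2020-01-31'
      GROUP BY person_id) tt
JOIN (SELECT person_id, SUM(duration) AS hour_amount
      FROM abs_tab
      WHERE absence_type = 'Paid Leave'
        AND start_date BETWEEN '2020-01-01' AND '2020-01-31'
      GROUP BY person_id) ab ON ab.person_id = tt.person_id
""").fetchone()

print(naive)    # (68.0, 30.0): every sum is inflated
print(grouped)  # (34.0, 27.0, 10.0)
```

The dates are stored as ISO strings here so that BETWEEN compares them correctly as text; with real DATE columns the same structure applies.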

See if what you need is something like this:
select * from
(SELECT person_number,
sum(
CASE
WHEN Time_type = 'Double' THEN measure
END
) AS Double,
sum(
CASE
WHEN Time_type = ('Regular Pay') THEN measure
END
) AS regular
from time_track_tab
group by person_number
) A
inner join
(SELECT
person_number,
sum(
CASE
WHEN Absence_type = 'Vacation' THEN duration
END
) AS Vacation,
sum(
CASE
WHEN Absence_type = ('Paid Leave') THEN duration
END
) AS paidLeave
from abs_tab
group by person_number
)B on A.person_number = B.person_number
Here is the fiddle:
http://sqlfiddle.com/#!4/21253/2
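One thing to watch with an inner join between two aggregated subqueries is that a person who has rows in only one of the tables disappears from the result. The sketch below, using Python's sqlite3 with a hypothetical `people` table and made-up rows, compares the inner-join and left-join variants; treat it as an illustration rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_number INTEGER);          -- hypothetical driving table
CREATE TABLE time_track_tab (person_number INTEGER, measure REAL);
CREATE TABLE abs_tab (person_number INTEGER, duration REAL);
INSERT INTO people VALUES (1), (2);
INSERT INTO time_track_tab VALUES (1, 10), (2, 8);
INSERT INTO abs_tab VALUES (1, 9);                     -- person 2 has no absences
""")

QUERY = """
SELECT p.person_number, a.total_measure, b.total_duration
FROM people p
{join} (SELECT person_number, SUM(measure) AS total_measure
        FROM time_track_tab GROUP BY person_number) a
       ON a.person_number = p.person_number
{join} (SELECT person_number, SUM(duration) AS total_duration
        FROM abs_tab GROUP BY person_number) b
       ON b.person_number = p.person_number
ORDER BY p.person_number
"""

inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)  # person 2 is missing
print(left_rows)   # person 2 is kept, with NULL (None) for the absence total
```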

Related

SQL query to get the data as of 31/03 of past three years

I have created the query below, which gives me the Bonus tagged to the person as of sysdate.
select
person_number ,
peef.effective_start_Date,
peef.value Amount
from
per_all_people_f papf,
pay_element_entries_f peef
where
papf.person_id = peef.person_id
and PEEF.element_name in ('Bonus')
and sysdate between peef.effective_start_Date and peef.effective_end_Date
I want to tweak this query to get the Amount as of 31/03 for each of the past three years, i.e. as of 31/03/2021, 31/03/2020 and 31/03/2019 instead of sysdate.
Output like:
Person_NUMBER effective_start_Date current_Amount 2021_AMOUNT 2020_AMOUNT 2019_AMOUNT
How can I tweak the same query and change the sysdate condition so that it also looks up the data for the past three years, for the 2021_amount, 2020_amount and 2019_amount columns?
You can move the last line to the SELECT list as conditionals such as
SELECT person_number, peef.effective_start_Date,
peef.value AS current_amount,
-- aliases that start with a digit must be double-quoted in Oracle
CASE WHEN date'2021-03-31' BETWEEN peef.effective_start_Date AND peef.effective_end_Date
THEN peef.value
END AS "2021_AMOUNT",
CASE WHEN date'2020-03-31' BETWEEN peef.effective_start_Date AND peef.effective_end_Date
THEN peef.value
END AS "2020_AMOUNT",
CASE WHEN date'2019-03-31' BETWEEN peef.effective_start_Date AND peef.effective_end_Date
THEN peef.value
END AS "2019_AMOUNT"
FROM per_all_people_f papf
JOIN pay_element_entries_f peef
ON peef.person_id = papf.person_id
WHERE peef.element_name = 'Bonus'
in order to get the currently displayed result. In this case, however, there will be multiple rows with NULL values in the amount columns. I suspect that you need aggregated results that show the summed-up amounts; if so, consider adding SUM() aggregation such as
SELECT person_number, peef.effective_start_Date,
SUM( peef.value ) AS current_amount,
-- aliases that start with a digit must be double-quoted in Oracle
SUM( CASE WHEN date'2021-03-31' BETWEEN peef.effective_start_Date AND peef.effective_end_Date
THEN peef.value
END ) AS "2021_AMOUNT",
SUM( CASE WHEN date'2020-03-31' BETWEEN peef.effective_start_Date AND peef.effective_end_Date
THEN peef.value
END ) AS "2020_AMOUNT",
SUM( CASE WHEN date'2019-03-31' BETWEEN peef.effective_start_Date AND peef.effective_end_Date
THEN peef.value
END ) AS "2019_AMOUNT"
FROM per_all_people_f papf
JOIN pay_element_entries_f peef
ON peef.person_id = papf.person_id
WHERE peef.element_name = 'Bonus'
GROUP BY person_number, peef.effective_start_Date
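As a quick check of the "one conditional column per as-of date" idea, the sketch below runs it against a tiny effective-dated table in SQLite from Python. The `entries` table, its columns, the sample rows and the `amount_YYYY` aliases are assumptions made for the example, and it groups by person only, so that all three as-of values land on one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (person_id INTEGER, value REAL, eff_start TEXT, eff_end TEXT);
-- one bonus amount per effective-dated period (made-up values)
INSERT INTO entries VALUES
  (1, 100, '2018-04-01', '2019-12-31'),
  (1, 200, '2020-01-01', '2020-12-31'),
  (1, 300, '2021-01-01', '4712-12-31');
""")

AS_OF_DATES = ["2021-03-31", "2020-03-31", "2019-03-31"]

# Each CASE picks the rows whose effective period contains its as-of date;
# SUM collapses the NULLs from non-matching periods into a single row per person.
row = conn.execute("""
SELECT person_id,
       SUM(CASE WHEN ? BETWEEN eff_start AND eff_end THEN value END) AS amount_2021,
       SUM(CASE WHEN ? BETWEEN eff_start AND eff_end THEN value END) AS amount_2020,
       SUM(CASE WHEN ? BETWEEN eff_start AND eff_end THEN value END) AS amount_2019
FROM entries
GROUP BY person_id
""", AS_OF_DATES).fetchone()

print(row)  # (1, 300.0, 200.0, 100.0)
```

If effective_start_Date stays in the GROUP BY, each period becomes its own row and at most one of the three columns is filled per row, which may or may not be what the report needs.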

How to solve a nested aggregate function in SQL?

I'm trying to use a nested aggregate function. I know that SQL does not support it, but I really need to do something like the query below. Basically, I want to count the number of users for each day, but only the users who have not completed an order within a 15-day window (relative to a specific day) and who have completed at least one order within a 30-day window (relative to that same day). I already know that this cannot be solved with a regular subquery, because the subquery values cannot change for each date. The "id" and "state" attributes relate to the orders. Also, I'm using Fivetran with Snowflake.
SELECT
db.created_at::date as Date,
count(case when
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-15,Date) and dateadd(day,-1,Date)) then db.id end)
= 0) and
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-30,Date) and dateadd(day,-16,Date)) then db.id end)
> 0) then db.user end)
FROM
data_base as db
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
In other words, I want to transform the query below so that the "current_date" changes for each date.
WITH completed_15_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-15,current_date) and dateadd(day,-1,current_date)
group by User
),
completed_16_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-30,current_date) and dateadd(day,-16,current_date)
group by User
)
SELECT
date(db.created_at) as Date,
count(distinct case when comp_15.Completed = 0 and comp_16.Completed > 0 then comp_15.user end) as "Total Users Churn",
count(distinct case when comp_15.Completed > 0 then comp_15.user end) as "Total Users Active",
week(Date) as Week
FROM
data_base as db
left join completed_15_days_before as comp_15 on comp_15.user = db.user
left join completed_16_days_before as comp_16 on comp_16.user = db.user
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
Does anyone have a clue on how to solve this puzzle? Thank you very much!
The following should give you roughly what you want. It is difficult to test without sample data, but it should be a good enough starting point for you to amend into exactly what you need.
I've commented the code to explain what each section is doing.
-- set parameter for the first date you want to generate the resultset for
set start_date = TO_DATE('2020-01-01','YYYY-MM-DD');
-- calculate the number of days between the start_date and the current date
set num_days = (Select datediff(day, $start_date , current_date()+1));
--generate a list of all the dates from the start date to the current date
-- i.e. every date that needs to appear in the resultset
WITH date_list as (
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date_item
from table (generator(rowcount => ($num_days)))
)
--Create a list of all the orders that are in scope
-- i.e. 30 days before the start_date up to the current date
-- amend WHERE clause to in/exclude records as appropriate
,order_list as (
SELECT created_at, rt_id
from data_base
where created_at between dateadd(day,-30,$start_date) and current_date()
and state = 'finished'
)
SELECT dl.date_item
,COUNT (DISTINCT ol30.RT_ID) AS USER_COUNT
,COUNT (ol30.RT_ID) as ORDER_COUNT
FROM date_list dl
-- get all orders between -30 and -16 days of each date in date_list
left outer join order_list ol30 on ol30.created_at between dateadd(day,-30,dl.date_item) and dateadd(day,-16,dl.date_item)
-- exclude records that have the same RT_ID as in the ol30 dataset but have a date between 0 and -15 days of the date in date_list
WHERE NOT EXISTS (SELECT ol15.RT_ID
FROM order_list ol15
WHERE ol30.RT_ID = ol15.RT_ID
AND ol15.created_at between dateadd(day,-15,dl.date_item) and dl.date_item)
GROUP BY dl.date_item
ORDER BY dl.date_item;
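Since the query above is hard to test without data, here is a small runnable sketch of the same pattern (a list of dates, a range join for the older window, and NOT EXISTS for the recent window) using Python's sqlite3. It is not Snowflake: SQLite's recursive CTE and `date()` modifiers stand in for `generator` and `dateadd`, the `orders` table and its rows are made up, and the recent window follows the question's definition (15 days before up to the day before).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (user_id TEXT, created_at TEXT, state TEXT);
INSERT INTO orders VALUES
  ('a', '2020-01-01', 'finished'),
  ('b', '2020-01-01', 'finished'),
  ('b', '2020-01-20', 'finished');
""")

rows = conn.execute("""
WITH RECURSIVE date_list(d) AS (      -- one row per reporting date
  SELECT '2020-01-16'
  UNION ALL
  SELECT date(d, '+1 day') FROM date_list WHERE d < '2020-01-22'
)
SELECT dl.d, COUNT(DISTINCT o30.user_id) AS churned_users
FROM date_list dl
-- finished orders placed 30 to 16 days before each reporting date
LEFT JOIN orders o30
       ON o30.state = 'finished'
      AND o30.created_at BETWEEN date(dl.d, '-30 days') AND date(dl.d, '-16 days')
-- drop users who also finished an order 15 to 1 days before that date
WHERE NOT EXISTS (
  SELECT 1 FROM orders o15
  WHERE o15.user_id = o30.user_id
    AND o15.state = 'finished'
    AND o15.created_at BETWEEN date(dl.d, '-15 days') AND date(dl.d, '-1 day')
)
GROUP BY dl.d
ORDER BY dl.d
""").fetchall()

for day, churned in rows:
    print(day, churned)
```

With these rows, user b stops counting as churned from 2020-01-21 onward, because the order on 2020-01-20 then falls inside the recent window.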

Sum of 2 selects statement in sql with different where Clause

I am trying to get two sums from one table, each with a different WHERE condition; the only difference is that the amount for one department is calculated based on a 17% margin.
The result should be the total revenue grouped by Event Name and Event ID.
For a SQL report, I have written two SQL statements with different conditions and got the correct values for the two columns separately. I managed to sum both in one statement, but only for a single event.
SELECT EVT_ID, Event_Desc, Sum(Order_Total) as Total + (Select SUm(Order_Total *0.17) as Total from Orders Join Events EM On OrD.EVT_ID = EV.EVENTS_ID
where EVT_START_DATE between '2019-01-01' and '2019-01-31' Order_Department = 'FAB' )
From Orders Join Events EM On OrD.EVT_ID = EV.EVENTS_ID
where EVT_START_DATE between '2019-01-01' and '2019-01-31' Order_Department <> 'FAB'
Group by EVT_ID, Event_Desc
select EVT_ID, Event_Desc, sum(Total) as Total
from
(
SELECT EVT_ID, Event_Desc, Sum(Order_Total) as Total
From Orders OrD
Join Events EV On OrD.EVT_ID = EV.EVENTS_ID
where EVT_START_DATE between '2019-01-01' and '2019-01-31' and Order_Department <> 'FAB'
Group by EVT_ID, Event_Desc
union all -- plain UNION would drop a row when both branches return the same total
Select EVT_ID, Event_Desc, Sum(Order_Total * 0.17) as Total
from Orders OrD
Join Events EV On OrD.EVT_ID = EV.EVENTS_ID
where EVT_START_DATE between '2019-01-01' and '2019-01-31' and Order_Department = 'FAB'
Group by EVT_ID, Event_Desc ) tbl
Group by EVT_ID, Event_Desc
OR
SELECT EVT_ID, Event_Desc, Sum(case when Order_Department = 'FAB' then Order_Total * 0.17 else Order_Total end) as Total
From Orders OrD
Join Events EV On OrD.EVT_ID = EV.EVENTS_ID
where EVT_START_DATE between '2019-01-01' and '2019-01-31'
Group by EVT_ID, Event_Desc
If I followed you correctly, you could approach this with conditional aggregation. You can use a CASE construct within the SUM aggregate function to check which department the current record belongs to and do the computation accordingly.
SELECT
o.evt_id,
event_desc,
SUM(CASE
WHEN order_department = 'FAB' THEN order_total * 0.17
ELSE order_total END
) AS Total
FROM orders o
INNER JOIN events e On o.evt_id = e.events_id
WHERE evt_start_date BETWEEN '2019-01-01' and '2019-01-31'
GROUP BY
o.evt_id,
event_desc
NB: most columns in your query are not prefixed with a table alias, which makes it unclear which table they come from. I added prefixes where I could make an educated guess from your SQL code, and I would highly recommend that you add them to all of the remaining columns.
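For anyone who wants to try the conditional aggregation on data, here is a minimal sketch using Python's sqlite3. The table layouts, department names other than 'FAB', and the amounts are invented for the example; which table each column belongs to is a guess, as it is in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (events_id INTEGER, event_desc TEXT, evt_start_date TEXT);
CREATE TABLE orders (evt_id INTEGER, order_department TEXT, order_total REAL);
INSERT INTO events VALUES (1, 'Gala', '2019-01-10'), (2, 'Expo', '2019-02-05');
INSERT INTO orders VALUES (1, 'FAB', 100), (1, 'AV', 50), (2, 'FAB', 100);
""")

# One pass over the joined rows: FAB orders contribute 17% of their total,
# every other department contributes the full amount.
rows = conn.execute("""
SELECT o.evt_id,
       e.event_desc,
       SUM(CASE WHEN o.order_department = 'FAB' THEN o.order_total * 0.17
                ELSE o.order_total END) AS total
FROM orders o
JOIN events e ON o.evt_id = e.events_id
WHERE e.evt_start_date BETWEEN '2019-01-01' AND '2019-01-31'
GROUP BY o.evt_id, e.event_desc
""").fetchall()

print(rows)  # one row for event 1; its total is 100 * 0.17 + 50, about 67
```

Event 2 is filtered out by the date range, so only the January event appears.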

COUNT from DISTINCT values in multiple columns

If this has been asked before, I apologize, I wasn't able to find a question/solution like it before breaking down and posting. I have the below query (using Oracle SQL) that works fine in a sense, but not fully what I'm looking for.
SELECT
order_date,
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END AS issue_group,
srt AS srt_level,
COUNT(*) AS total_orders
FROM
database.t_con
WHERE
order_date IN (
'&Enter_Date_YYYYMM'
)
GROUP BY
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END,
srt,
order_date
ORDER BY
p_category,
issue_group,
srt_level,
order_date
Current Return (12 rows):
Needed Return (8 rows without the tan rows being shown):
Here is the logic of total_order column that I'm expecting:
count of order_date where (srt_level = 80 + 100 + Late) ... the 'Late' counts need to be added to the total, just not displayed
I'm eventually adding a filled_orders column that will go before the total_orders column, but I'm just not there yet.
Sorry I wasn't as descriptive earlier. Thanks again!
You don't appear to need a subquery; if you want the count for each combination of values then group by those, and aggregate at that level; something like:
SELECT
t1.order_date,
t1.p_category,
CASE
WHEN ( t1.issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END AS issue_group,
t1.srt AS srt_level,
COUNT(*) AS total_orders
FROM
database.t_con t1
WHERE
t1.order_date = TO_DATE ( '&Enter_Date_YYYYMM', 'YYYYMM' )
GROUP BY
t1.p_category,
CASE
WHEN ( t1.issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END,
t1.srt,
t1.order_date
ORDER BY
p_category,
issue_group,
srt_level,
order_date;
You shouldn't be relying on implicit conversion and NLS settings for your date argument (assuming order_date is actually a date column, not a string), so I've used an explicit TO_DATE() call, using the format suggested by your substitution variable name and prompt.
However, that will give you the first day of the supplied month, since a day number isn't being supplied. It's more likely that you either want to prompt for a full date, or (possibly) just the year/month but want to include all days in that month - which IN() will not do, if that was your intention. It also implies that stored dates all have their time portions set to midnight, as that is all it will match on. If those values have non-midnight times then you need a range to pick those up too.
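To illustrate the difference between matching the first day of the month and matching the whole month, here is a small sketch with Python's sqlite3. The helper `month_bounds`, the one-column table and the timestamps are assumptions for the example; the point is the half-open range (`>=` start and `<` start of the next month), which also picks up non-midnight times.

```python
import sqlite3
from datetime import datetime

def month_bounds(yyyymm):
    """Return (first instant of the month, first instant of the next month) as strings."""
    start = datetime.strptime(yyyymm, "%Y%m")
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    fmt = "%Y-%m-%d %H:%M:%S"
    return start.strftime(fmt), end.strftime(fmt)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_con (order_date TEXT);
INSERT INTO t_con VALUES
  ('2021-03-01 00:00:00'),
  ('2021-03-15 13:45:00'),
  ('2021-03-31 23:59:59'),
  ('2021-04-01 00:00:00');
""")

start, end = month_bounds("202103")

# Equality only matches rows stamped exactly at midnight on the first day.
equal_count = conn.execute(
    "SELECT COUNT(*) FROM t_con WHERE order_date = ?", (start,)
).fetchone()[0]

# A half-open range covers every instant of the month and nothing from April.
range_count = conn.execute(
    "SELECT COUNT(*) FROM t_con WHERE order_date >= ? AND order_date < ?",
    (start, end),
).fetchone()[0]

print(equal_count, range_count)  # 1 3
```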
I got it working to the extent of what my question was. Just needed to nest each column where counts/calculations were happening.
SELECT
order_date,
p_category,
issue_group,
srt_level,
order_count,
SUM(order_count) OVER(
PARTITION BY order_date, issue_group, p_category
) AS total_orders
FROM
(
SELECT
order_date,
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END AS issue_group,
srt AS srt_level,
COUNT(*) AS order_count
FROM
database.t_con
WHERE
order_date IN (
'&Enter_Date_YYYYMM'
)
GROUP BY
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END,
srt,
order_date
)
ORDER BY
order_date,
p_category,
issue_group
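Here is a reduced version of that nesting that can be run with Python's sqlite3 (window functions need SQLite 3.25 or later). The table contents are made up, and the extra outer query that hides the 'Late' rows is my assumption about how to meet the "count it but don't display it" requirement: the filter has to sit outside the level where the window total is computed, otherwise the 'Late' rows would be removed before they are summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_con (p_category TEXT, issue_grp INTEGER, srt TEXT);
INSERT INTO t_con VALUES
  ('A', 1, '80'), ('A', 1, '80'), ('A', 1, '100'), ('A', 1, 'Late');
""")

rows = conn.execute("""
SELECT p_category, srt_level, order_count, total_orders
FROM (
  -- level 2: add the per-category total across all srt levels, 'Late' included
  SELECT p_category, srt_level, order_count,
         SUM(order_count) OVER (PARTITION BY p_category) AS total_orders
  FROM (
    -- level 1: plain GROUP BY counts per srt level
    SELECT p_category, srt AS srt_level, COUNT(*) AS order_count
    FROM t_con
    GROUP BY p_category, srt
  )
)
WHERE srt_level <> 'Late'   -- level 3: hide the rows after they were counted
ORDER BY srt_level
""").fetchall()

print(rows)  # [('A', '100', 1, 4), ('A', '80', 2, 4)]
```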

SQL query that does two GROUP BYs?

I'm having trouble getting the SQL for a report I need to generate. I've got the (equivalent of the) following setup:
- A table articles (fields such as name, manufacturer_id, etc.).
- A table sales, with:
  - an FK to articles called article_id
  - an integer called amount
  - a date field
  - a field called type. We can assume it is a string that can have 3 known values: 'morning', 'evening' and 'night'
I want to generate an aggregated sales report, given a start date and end date:
+--------------+---------------+--------------+-------------+
| article_name | morning_sales | evening_sales| night_sales |
+--------------+---------------+--------------+-------------+
| article 1 | 0 | 12 | 2 |
| article 2 | 80 | 3 | 0 |
... ... ... ... ...
| article n | 37 | 12 | 1 |
+--------------+---------------+--------------+-------------+
I'd like to do it as efficiently as possible. So far I've been able to generate a query that will give me one type of sale (morning, evening or night) but I'm not able to do it for multiple types simultaneously. Is it even possible?
This is what I have so far; it'll give me the article name and morning sales of all the articles in a given period - in other words, the first two columns of the report:
SELECT articles.name as article_name,
SUM(sales.amount) as morning_sales
FROM "sales"
INNER JOIN articles ON articles.id = sales.article_id
WHERE ( sales.date >= '2011-05-09'
AND sales.date <= '2011-05-16'
AND sales.type = 'morning'
)
GROUP BY sales.article_id
I guess I could do the same for evening and night, but the articles will be different; some articles might have sales in the morning but not in the evening, for example.
If I have to do 1 request per sale type, how do I "mix and match" the different article lists I'll get?
Is there a better way to do this (maybe with subqueries of some sort)?
Similarly, I'm able to build a query that gives me all the morning, evening and night sales, grouping by type. I guess my problem is that I need to do two GROUP BYs in order to get this report. I don't know how to do that, if it's possible at all.
I'm using PostgreSQL as my DB, but I'd like to stay as standard as possible. If it helps, the app using this is a Rails app.
Fortunately, you don't need to do multiple queries with your database format. This should work for you:
SELECT
articles.name AS article_name,
SUM(IF(sales.type = 'morning', sales.amount, 0.0)) AS morning_sales,
SUM(IF(sales.type = 'evening', sales.amount, 0.0)) AS evening_sales,
SUM(IF(sales.type = 'night', sales.amount, 0.0)) AS night_sales
FROM sales
JOIN articles ON sales.article_id = articles.id
WHERE
sales.date >= "2011-01-01 00:00:00"
AND sales.date < "2011-02-01 00:00:00"
GROUP BY sales.article_id
And if there are other types, you would have to add more columns there, OR simply sum up the other types by adding this to the SELECT clause:
SUM(
IF(sales.type IS NULL OR sales.type NOT IN ('morning', 'evening', 'night'),
sales.amount, 0.0
)
) AS other_sales
The above is compatible with MySQL. To use it in Postgres, I think you'd have to change the IF expressions to CASE expressions, which should look like this (untested):
SELECT
articles.name AS article_name,
SUM(CASE WHEN sales.type = 'morning' THEN sales.amount ELSE 0.0 END) AS morning_sales,
SUM(CASE WHEN sales.type = 'evening' THEN sales.amount ELSE 0.0 END) AS evening_sales,
SUM(CASE WHEN sales.type = 'night' THEN sales.amount ELSE 0.0 END) AS night_sales
FROM sales
JOIN articles ON sales.article_id = articles.id
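WHERE
sales.date >= '2011-01-01 00:00:00'
AND sales.date < '2011-02-01 00:00:00'
-- Postgres reads double quotes as identifiers, and needs articles.name covered by the GROUP BY
GROUP BY articles.id, articles.name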
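The CASE-based version can be tried out with Python's sqlite3, as in the sketch below. The schema follows the question's description, while the rows are invented so that the result matches the first two lines of the report in the question; it groups by the article's id and name so that the same statement is acceptable to stricter engines too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (article_id INTEGER, type TEXT, amount INTEGER, date TEXT);
INSERT INTO articles VALUES (1, 'article 1'), (2, 'article 2');
INSERT INTO sales VALUES
  (1, 'evening', 12, '2011-01-05'),
  (1, 'night',    2, '2011-01-06'),
  (2, 'morning', 50, '2011-01-07'),
  (2, 'morning', 30, '2011-01-08'),
  (2, 'evening',  3, '2011-01-09'),
  (2, 'night',    9, '2011-02-01');   -- outside the reporting period
""")

# One GROUP BY on the article; each CASE routes an amount to its own column.
rows = conn.execute("""
SELECT articles.name AS article_name,
       SUM(CASE WHEN sales.type = 'morning' THEN sales.amount ELSE 0 END) AS morning_sales,
       SUM(CASE WHEN sales.type = 'evening' THEN sales.amount ELSE 0 END) AS evening_sales,
       SUM(CASE WHEN sales.type = 'night'   THEN sales.amount ELSE 0 END) AS night_sales
FROM sales
JOIN articles ON sales.article_id = articles.id
WHERE sales.date >= '2011-01-01' AND sales.date < '2011-02-01'
GROUP BY articles.id, articles.name
ORDER BY articles.id
""").fetchall()

print(rows)  # [('article 1', 0, 12, 2), ('article 2', 80, 3, 0)]
```

Articles with no sales at all in the period do not appear with this inner join; starting from articles with a LEFT JOIN, and moving the date filter into the ON clause, would keep them.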
Two options.
Option 1. A single join with computed columns for aggregation
select article_name = a.article_name ,
morning_sales = sum( case when s.type = 'morning' then s.amount end ) ,
evening_sales = sum( case when s.type = 'evening' then s.amount end ) ,
night_sales = sum( case when s.type = 'night' then s.amount end ) ,
other_sales = sum( case when s.type in ( 'morning','evening','night') then null else s.amount end ) ,
total_sales = sum( s.amount )
from articles a
join sales s on s.articles_id = a.articles_id
where s.date between @dtFrom and @dtThru
group by a.article_name
Option 2. Multiple left joins (beware: each extra join multiplies the rows per article, so these sums can be inflated when an article has several sales in the period; Option 1 does not have this problem)
select article_name = a.article_name ,
morning_sales = sum( morning.amount ) ,
evening_sales = sum( evening.amount ) ,
night_sales = sum( night.amount ) ,
other_sales = sum( other.amount ) ,
total_sales = sum( total.amount )
from articles a
left join sales morning on morning.articles_id = a.articles_id
and morning.type = 'morning'
and morning.date between @dtFrom and @dtThru
left join sales evening on evening.articles_id = a.articles_id
and evening.type = 'evening'
and evening.date between @dtFrom and @dtThru
left join sales night on night.articles_id = a.articles_id
and night.type = 'night'
and night.date between @dtFrom and @dtThru
left join sales other on other.articles_id = a.articles_id
and ( other.type is null
OR other.type not in ('morning','evening','night')
)
and other.date between @dtFrom and @dtThru
left join sales total on total.articles_id = a.articles_id
and total.date between @dtFrom and @dtThru
group by a.article_name
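It is worth checking the multiple-left-join form on data before relying on it, because joining the same sales table several times pairs every morning row with every evening row of an article. The sketch below, in Python with sqlite3 and made-up rows, compares it with the single-join CASE form; only two of the joins are reproduced to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (article_id INTEGER, type TEXT, amount INTEGER);
INSERT INTO articles VALUES (1, 'article 1');
INSERT INTO sales VALUES
  (1, 'morning', 10), (1, 'morning', 20),
  (1, 'evening', 5),  (1, 'evening', 7);
""")

# Two morning rows x two evening rows = four joined rows for the article,
# so every amount is added twice.
multi_join = conn.execute("""
SELECT SUM(m.amount), SUM(e.amount)
FROM articles a
LEFT JOIN sales m ON m.article_id = a.id AND m.type = 'morning'
LEFT JOIN sales e ON e.article_id = a.id AND e.type = 'evening'
GROUP BY a.id
""").fetchone()

# A single join sees each sales row exactly once.
single_join = conn.execute("""
SELECT SUM(CASE WHEN s.type = 'morning' THEN s.amount END),
       SUM(CASE WHEN s.type = 'evening' THEN s.amount END)
FROM articles a
JOIN sales s ON s.article_id = a.id
GROUP BY a.id
""").fetchone()

print(multi_join)   # (60, 24): doubled
print(single_join)  # (30, 12)
```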