SQL query to aggregate month in one table

I have a table with one column (TK) that holds multiple, possibly duplicated, values and another column that holds a date.
I need to return a table whose first column is the distinct TK values and whose other columns are the months, each holding the count for that month.
I put an example on SQL Fiddle:
http://sqlfiddle.com/#!18/14cb9f/28
TK       JANUARY
open a   4
open B   4
TK       FEBRUARY
open a   4
open B   4
I need
TK       JANUARY   FEBRUARY
open a   4         4
open B   4         4
Thanks

A simple conditional aggregation should do the trick
SELECT TK
,January = sum( case when month(datastart)=1 then 1 else 0 end )
,February = sum( case when month(datastart)=2 then 1 else 0 end )
From TEST
Where year(datastart)=2021
Group By TK
Or you can use PIVOT
Select *
From (
Select TK
,Col = datename(month,DataStart)
,Val = 1
From TEST
Where year(datastart)=2021
) src
Pivot ( sum(Val) for Col in ([January] ,[February] ) ) pvt
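To try the conditional aggregation without a SQL Server instance, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. The sample rows are made up for illustration, and SQLite's strftime('%m', ...) is assumed as a stand-in for T-SQL's month(), so treat it as an approximation of the query above rather than the exact statement.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (tk TEXT, datastart TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("open a", "2021-01-05"), ("open a", "2021-01-20"),
        ("open a", "2021-02-03"),
        ("open B", "2021-01-11"),
        ("open B", "2021-02-14"), ("open B", "2021-02-27"),
        ("open B", "2020-02-01"),  # different year, filtered out by WHERE
    ],
)

# One SUM(CASE ...) per month turns rows into columns;
# strftime('%m', ...) plays the role of T-SQL's month().
month_rows = conn.execute(
    """
    SELECT tk,
           SUM(CASE WHEN strftime('%m', datastart) = '01' THEN 1 ELSE 0 END) AS january,
           SUM(CASE WHEN strftime('%m', datastart) = '02' THEN 1 ELSE 0 END) AS february
    FROM test
    WHERE strftime('%Y', datastart) = '2021'
    GROUP BY tk
    """
).fetchall()

for row in month_rows:
    print(row)
```

Each extra month is one more SUM(CASE ...) column, which is why this form is easy to extend by hand but needs dynamic SQL if the set of months is not known in advance.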

There are multiple ways to do this, but avoiding sub-queries and making the syntax simple to read, this is the simplest I can get:
SELECT
TK,
SUM(
CASE WHEN DATASTART >= '2021-01-01' AND DATASTART < '2021-02-01' THEN 1 ELSE 0 END
) AS JANUARY,
SUM(
CASE WHEN DATASTART >= '2021-02-01' AND DATASTART < '2021-03-01' THEN 1 ELSE 0 END
) AS FEBRUARY
FROM
Test
GROUP BY
TK
Check it out
http://sqlfiddle.com/#!18/14cb9f/34

Related

CASE WHEN condition with MAX() function

There are a lot of questions on the CASE WHEN topic, but the closest to mine is How to use CASE WHEN condition with MAX() function, which has not been resolved.
Here is some of my sample data:
date         debet
2022-07-15   57190.33
2022-07-14   815616516.00
2022-07-15   40866.67
2022-07-14   1221510.00
So, I want all records for the last two dates plus three additional columns: the sum of debet for the current day, the sum for the previous day, and the difference between them:
SELECT
[debet],
[date] ,
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END ) AS sum_act,
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END ) AS sum_prev ,
(
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END )
-
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END )
) AS diff
FROM
Table
WHERE
[date] = ( SELECT MAX(date) FROM Table WHERE date < ( SELECT MAX(date) FROM Table) )
OR
[date] = ( SELECT MAX(date) FROM Table WHERE date = ( SELECT MAX(date) FROM Table ) )
GROUP BY
[date],
[debet]
Of course, SQL Server then reports that I can't use an aggregate function inside CASE WHEN. For now I use this combination: sum(CASE WHEN [date] = dateadd(dd,-3,cast(getdate() as date)) THEN [debet] ELSE 0 END). But then I have to adjust it every time for weekends and holidays. The question is: is there any way other than using getdate() in the CASE WHEN expression to get the max date?
Expected result:
date         sum_act    sum_prev    diff
2022-07-15   97190.33   0.00        97190.33
2022-07-14   0.00       508769.96   -508769.96

You can use dense_rank() to rank the dates and keep only the last 2 dates in your table. After that, you can use a conditional case expression with sum() to calculate the required values.
select [date],
sum_act = sum(case when rn = 1 then [debet] else 0 end),
sum_prev = sum(case when rn = 2 then [debet] else 0 end),
diff = sum(case when rn = 1 then [debet] else 0 end)
- sum(case when rn = 2 then [debet] else 0 end)
from
(
select *, rn = dense_rank() over (order by [date] desc)
from tbl
) t
where rn <= 2
group by [date]
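For anyone who wants to see the ranking step in action, the following is a small sketch of the dense_rank() approach using Python's sqlite3 module. It assumes a SQLite build with window functions (3.25 or later), and the table contents are invented for the example, not taken from the question.

```python
import sqlite3

# Hypothetical data; requires SQLite 3.25+ for window functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, debet INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("2022-07-15", 100), ("2022-07-14", 30),
        ("2022-07-15", 50), ("2022-07-14", 20),
        ("2022-07-11", 999),  # older date, dropped by rn <= 2
    ],
)

# DENSE_RANK gives every row of the latest date rn = 1 and every row of the
# date before it rn = 2, regardless of weekends or gaps in the calendar.
rank_rows = conn.execute(
    """
    SELECT date,
           SUM(CASE WHEN rn = 1 THEN debet ELSE 0 END) AS sum_act,
           SUM(CASE WHEN rn = 2 THEN debet ELSE 0 END) AS sum_prev
    FROM (
        SELECT date, debet,
               DENSE_RANK() OVER (ORDER BY date DESC) AS rn
        FROM tbl
    ) AS t
    WHERE rn <= 2
    GROUP BY date
    ORDER BY date DESC
    """
).fetchall()

for row in rank_rows:
    print(row)
```

Because the rank is derived from the dates that actually exist in the table, no getdate() arithmetic or holiday adjustment is involved.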

Two steps:
Get the sums for the last three dates
Show the results for the last two dates.
Well, we could also get all daily sums in step 1, but we just need the last three in order to calculate the sums for the last two days, so why aggregate more data than necessary?
Here is the query. You may have to put the date column name in brackets in SQL Server, as date is a keyword in SQL.
select top(2)
date,
sum_debit_current,
sum_debit_previous,
sum_debit_current - sum_debit_previous as diff
from
(
select
date,
sum(debet) as sum_debit_current,
lag(sum(debet)) over (order by date) as sum_debit_previous
from table
where date in (select distinct top(3) date from table order by date desc)
group by date
) daily_sums
order by date desc;
(SQL Server uses TOP(n) instead of standard SQL FETCH FIRST 3 ROWS and while SELECT DISTINCT TOP(3) date looks like "get the top 3 rows, then apply distinct on their date", it is really "apply distinct on the dates, then get the top 3" like in standard SQL.)
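The two steps can be sketched with Python's sqlite3 module as follows. This is an approximation under a few assumptions: SQLite 3.25+ for LAG, LIMIT in place of TOP(n), made-up sample rows, and the daily sums are computed in their own subquery before LAG is applied, which is a slightly more verbose layout than the query above.

```python
import sqlite3

# Hypothetical data; LIMIT replaces SQL Server's TOP(n).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, debet INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("2022-07-15", 100), ("2022-07-15", 50),
        ("2022-07-14", 30), ("2022-07-14", 20),
        ("2022-07-13", 70),
        ("2022-07-11", 999),  # fourth-latest date, never aggregated
    ],
)

lag_rows = conn.execute(
    """
    SELECT date, cur, prev, cur - prev AS diff
    FROM (
        -- Step 1b: attach the previous date's sum to each daily sum
        SELECT date, cur, LAG(cur) OVER (ORDER BY date) AS prev
        FROM (
            -- Step 1a: daily sums for the last three dates only
            SELECT date, SUM(debet) AS cur
            FROM tbl
            WHERE date IN (SELECT DISTINCT date FROM tbl
                           ORDER BY date DESC LIMIT 3)
            GROUP BY date
        ) AS daily
    ) AS with_prev
    -- Step 2: keep the last two dates
    ORDER BY date DESC
    LIMIT 2
    """
).fetchall()

for row in lag_rows:
    print(row)
```

The third-latest date is read only so that the second-latest date has a previous value to compare against.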

SQL - Group data with same ID and Date that has been to every Machine but has a different Name

I am trying to create a query that will group data by CT ID and Date that have all 3 MachineID's (1, 10, and 20) and at least one different Sawing Pattern Name.
This image shows a highlighted example of the data I'm trying to get back and the code I'm currently using.
I'm trying to only show data similar to the highlighted rows in the image (CT ID 501573833) and exclude the data in the rows around it where the Sawing Pattern Name is the same at all 3 MachineID's.

Your description suggests group by and having. The conditions you describe can all go in the having clause:
select ct_id, date
from t
group by ct_id, date
having sum(case when machineid = 1 then 1 else 0 end) > 0 and
sum(case when machineid = 10 then 1 else 0 end) > 0 and
sum(case when machineid = 20 then 1 else 0 end) > 0 and
min(sawing_pattern_name) <> max(sawing_pattern_name)
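A quick way to check the having logic is to run it against a tiny data set. The sketch below uses Python's sqlite3 module with invented rows; the table and column names follow the query above (t, ct_id, machineid, sawing_pattern_name) and are assumptions about the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ct_id INTEGER, date TEXT, machineid INTEGER, "
    "sawing_pattern_name TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        # ct_id 1: all three machines, one differing name -> kept
        (1, "2020-01-01", 1, "A"), (1, "2020-01-01", 10, "A"), (1, "2020-01-01", 20, "B"),
        # ct_id 2: all three machines, same name everywhere -> dropped
        (2, "2020-01-01", 1, "A"), (2, "2020-01-01", 10, "A"), (2, "2020-01-01", 20, "A"),
        # ct_id 3: machine 20 missing -> dropped
        (3, "2020-01-01", 1, "A"), (3, "2020-01-01", 10, "B"),
    ],
)

having_rows = conn.execute(
    """
    SELECT ct_id, date
    FROM t
    GROUP BY ct_id, date
    HAVING SUM(CASE WHEN machineid = 1  THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN machineid = 10 THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN machineid = 20 THEN 1 ELSE 0 END) > 0
       AND MIN(sawing_pattern_name) <> MAX(sawing_pattern_name)
    """
).fetchall()

print(having_rows)
```

MIN <> MAX is a compact test for "at least two distinct values" within a group; it ignores NULL names.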

Seems to me that an EXISTS could be useful here.
SELECT
[CT ID],
[MachineID],
[Sawing Pattern name],
[Time],
CAST([Time] AS DATE) AS [Date]
FROM [DataCollector].[dbo].[Maxicut] t
WHERE EXISTS
(
SELECT 1
FROM [DataCollector].[dbo].[Maxicut] d
WHERE d.[CT ID] = t.[CT ID]
AND CAST(d.[Time] AS DATE) = CAST(t.[Time] AS DATE)
AND d.[MachineID] != t.[MachineID]
AND REPLACE(d.[Sawing Pattern name],',','') != REPLACE(t.[Sawing Pattern name],',','')
);

SQL find consecutive days of specific threshold reached

I have two columns; the_day and amount_raised. I want to find the count of consecutive days that at least 1 million dollars was raised. Am I able to do this in SQL? Ideally, I'd like to create a column that counts the consecutive days and then starts over if the 1 million dollar threshold is not reached.
What I've done thus far is create a third column that puts a 1 in the row if 1 million was reached. Could I create a subquery and count the consecutive 1's listed, then reset when it hits 0?
and here is the desired output

select dt,amt,
case when amt>=1000000 then -1+row_number() over(partition by col order by dt)
else 0 end col1
from (select *, sum(case when amt >= 1000000 then 0 else 1 end) over(order by dt) col
from t) x
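The idea behind this kind of query is a "gaps and islands" grouping: a running count of below-threshold days gives every streak its own group number, and a counter restarts inside each group. Here is a minimal sketch with Python's sqlite3 module (SQLite 3.25+ assumed, sample amounts invented). Note that it uses a running SUM inside each group instead of the row_number() - 1 form above, so a streak that starts on the very first row is counted from 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dt TEXT, amt INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2020-01-01", 2000000),
        ("2020-01-02", 1500000),
        ("2020-01-03", 500000),   # below threshold, streak resets
        ("2020-01-04", 1000000),
        ("2020-01-05", 3000000),
        ("2020-01-06", 200000),   # below threshold, streak resets
    ],
)

streak_rows = conn.execute(
    """
    SELECT dt, amt,
           -- count qualifying days so far within the current group
           SUM(CASE WHEN amt >= 1000000 THEN 1 ELSE 0 END)
               OVER (PARTITION BY grp ORDER BY dt) AS streak
    FROM (
        SELECT dt, amt,
               -- group number grows by one at every below-threshold day
               SUM(CASE WHEN amt >= 1000000 THEN 0 ELSE 1 END)
                   OVER (ORDER BY dt) AS grp
        FROM t
    ) AS x
    ORDER BY dt
    """
).fetchall()

for row in streak_rows:
    print(row)
```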

SELECT the_day,
amount_raised,
million_threshold,
CASE WHEN million_threshold <> lag_million_threshold AND million_threshold = lead_million_threshold
THEN 1
WHEN million_threshold = lag_million_threshold
THEN SUM(million_threshold) OVER ( ORDER BY the_day ROWS UNBOUNDED PRECEDING )
ELSE 0
END AS consecutive_day_cnt
FROM
(
SELECT the_day,
amount_raised,
million_threshold,
LAG(million_threshold,1) OVER ( ORDER BY the_day ) AS lag_million_threshold,
LEAD(million_threshold,1) OVER ( ORDER BY the_day ) AS lead_million_threshold
FROM
(
SELECT the_day,
amount_raised,
CASE WHEN amount_raised >= 1000000
THEN 1
ELSE 0
END AS million_threshold
FROM Yourtable
)
);

How to count every half hour?

I have a query that counts every hour, using a pivot table.
How would it be possible to get the count for every 30 minutes?
For example 8:00-8:29, 8:30-8:59, 9:00-9:29 etc. until 5:00.
SELECT CONVERT(varchar(8),start_date,1) AS 'Day',
SUM(CASE WHEN DATEPART(hour,start_date) = 8 THEN 1 ELSE 0 END) as eight ,
SUM(CASE WHEN DATEPART(hour,start_date) = 9 THEN 1 ELSE 0 END) AS nine,
SUM(CASE WHEN DATEPART(hour,start_date) = 10 THEN 1 ELSE 0 END) AS ten,
SUM(CASE WHEN DATEPART(hour,start_date) = 11 THEN 1 ELSE 0 END) AS eleven,
SUM(CASE WHEN DATEPART(hour,start_date) = 12 THEN 1 ELSE 0 END) AS twelve,
SUM(CASE WHEN DATEPART(hour,start_date) = 13 THEN 1 ELSE 0 END) AS one_clock,
SUM(CASE WHEN DATEPART(hour,start_date) = 14 THEN 1 ELSE 0 END) AS two_clock,
SUM(CASE WHEN DATEPART(hour,start_date) = 15 THEN 1 ELSE 0 END) AS three_clock,
SUM(CASE WHEN DATEPART(hour,start_date) = 16 THEN 1 ELSE 0 END) AS four_clock
FROM test
where user_id is not null
GROUP BY CONVERT(varchar(8),start_date,1)
ORDER BY CONVERT(varchar(8),start_date,1)
I use sql server 2012 (version Microsoft SQL Server Management Studio 11.0.3128.0)

Try using iif as below:
SELECT CONVERT(varchar(8),start_date,1) AS 'Day',
SUM(iif(DATEPART(hour,start_date) = 8 and
DATEPART(minute,start_date) >= 0 and
DATEPART(minute,start_date) <= 29,1,0)) as eight_thirty
FROM test
where user_id is not null
GROUP BY CONVERT(varchar(8),start_date,1)
ORDER BY CONVERT(varchar(8),start_date,1)

To get counts by day and half hour, something like this should work.
SELECT day, half_hour, count(1) AS half_hour_count
FROM (
SELECT
CAST(start_date AS date) AS day,
DATEPART(hh, start_date)
+ 0.5*(DATEPART(n,start_date)/30) AS half_hour
FROM test
WHERE user_id IS NOT NULL
) qry
GROUP BY day, half_hour
ORDER BY day, half_hour;
Formatting the result could be done later.
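To see what the half_hour expression produces, here is a rough equivalent run through Python's sqlite3 module. The rows are invented, and strftime('%H') / strftime('%M') with integer casts are assumed as SQLite substitutes for DATEPART(hh, ...) and DATEPART(n, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (user_id INTEGER, start_date TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        (1, "2016-08-23 08:05:00"),
        (2, "2016-08-23 08:29:00"),
        (3, "2016-08-23 08:30:00"),
        (4, "2016-08-23 09:45:00"),
        (None, "2016-08-23 08:10:00"),  # no user, excluded
    ],
)

# minute / 30 is integer division (0 or 1), so each hour splits into
# buckets h.0 (minutes 0-29) and h.5 (minutes 30-59).
half_hour_rows = conn.execute(
    """
    SELECT day, half_hour, COUNT(1) AS half_hour_count
    FROM (
        SELECT date(start_date) AS day,
               CAST(strftime('%H', start_date) AS INTEGER)
                 + 0.5 * (CAST(strftime('%M', start_date) AS INTEGER) / 30)
                 AS half_hour
        FROM test
        WHERE user_id IS NOT NULL
    ) AS qry
    GROUP BY day, half_hour
    ORDER BY day, half_hour
    """
).fetchall()

for row in half_hour_rows:
    print(row)
```

This returns one row per non-empty bucket; half hours with no rows simply do not appear, which is the main difference from the calendar-based approach.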

You need a few things, and then this query just falls together.
First, assuming you need multiple dates, you're going to want what's known as a Calendar Table (hands down, probably the most useful analysis table).
Next, you're going to want either an existing Numbers table if you have one, or just generate the first on the fly:
WITH Halfs AS (SELECT CAST(0 AS INT) m
UNION ALL
SELECT m + 1
FROM Halfs
WHERE m < 24 * 2 - 1) -- yields 0..47, one row per half hour of the day
SELECT m
FROM Halfs
(recursive CTE - generates a table with a list of numbers starting at 0).
These two tables will provide the basis for a range query based on the timestamps in your main table. This will make it very easy for the optimizer to bucket rows for whatever aggregation you're doing. That's done by CROSS JOINing the two tables together in a subquery, as well as adding a couple of other derived columns:
WITH Halfs AS (SELECT CAST(0 AS INT) m
UNION ALL
SELECT m + 1
FROM Halfs
WHERE m < 24 * 2 - 1)
SELECT calendarDate, rangeGroup, rangeStart, rangeEnd
FROM (SELECT Calendar.calendarDate, Halfs.m rangeGroup,
DATEADD(minute, m * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeStart,
DATEADD(minute, (m + 1) * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeEnd
FROM Calendar
CROSS JOIN Halfs
WHERE Calendar.calendarDate >= CAST('20160823' AS DATE)
AND Calendar.calendarDate < CAST('20160830' AS DATE)
-- OR whatever your date range actually is.
) Range
ORDER BY rangeStart
(Note that, if the range of dates is sufficiently large, it may be beneficial to save this off as a temporary table with indices. For small tables and datasets, the performance gain isn't likely to be noticeable.)
Now that we have our ranges, it's trivial to get our groups, and pivot the table.
Oh, and SQL Server has a specific operator for PIVOTing.
WITH Halfs AS (SELECT CAST(0 AS INT) m
UNION ALL
SELECT m + 1
FROM Halfs
WHERE m < 3 * 2)
-- Intentionally limiting range for example only
SELECT calendarDate AS day, [0], [1], [2], [3], [4], [5], [6]
-- If you're displaying "nice" names,
-- do it at this point, or in the reporting application
FROM (SELECT Range.calendarDate, Range.rangeGroup, Test.user_id
FROM (SELECT Calendar.calendarDate, Halfs.m rangeGroup,
DATEADD(minute, m * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeStart,
DATEADD(minute, (m + 1) * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeEnd
FROM Calendar
CROSS JOIN Halfs
WHERE Calendar.calendarDate >= CAST('20160823' AS DATE)
AND Calendar.calendarDate < CAST('20160830' AS DATE)
-- OR whatever your date range actually is.
) Range
LEFT JOIN Test
ON Test.user_id IS NOT NULL
AND Test.start_date >= Range.rangeStart
AND Test.start_date < Range.rangeEnd
) AS DataTable
-- PIVOT needs a column to aggregate (COUNT(*) is not accepted)
-- and an unqualified column name after FOR.
PIVOT (COUNT(user_id)
FOR rangeGroup IN ([0], [1], [2], [3], [4], [5], [6])) AS PT
-- Only covers the first 6 groups,
-- or the first three hours.
ORDER BY day
The pivot should take care of getting the individual columns, and COUNT will automatically resolve null rows. That should be all you need.

change rows to columns and count

how to calculate count based on rows?
SOURCE TABLE
each employee can take 2 days off
Employee-----First_Day_Off-----Second_Day_Off
1------------10/21/2009--------12/6/2009
2------------09/3/2009--------12/6/2009
3------------09/3/2009--------NULL
4
5
.
.
.
Now I need a table that shows the dates and the number of people taking off on each day
Date---------First_Day_Off-------Second_Day_Off
10/21/2009---1-------------------0
12/06/2009---1--------------------1
09/3/2009----2--------------------0
Any ideas?

Oracle 9i+, using Subquery Factoring (WITH):
WITH sample AS (
SELECT a.employee,
a.first_day_off AS day_off,
1 AS day_number
FROM YOUR_TABLE a
WHERE a.first_day_off IS NOT NULL
UNION ALL
SELECT b.employee,
b.second_day_off,
2 AS day_number
FROM YOUR_TABLE b
WHERE b.second_day_off IS NOT NULL)
SELECT s.day_off AS date,
SUM(CASE WHEN s.day_number = 1 THEN 1 ELSE 0 END) AS first_day_off,
SUM(CASE WHEN s.day_number = 2 THEN 1 ELSE 0 END) AS second_day_off
FROM sample s
GROUP BY s.day_off
Non Subquery Version
SELECT s.day_off AS date,
SUM(CASE WHEN s.day_number = 1 THEN 1 ELSE 0 END) AS first_day_off,
SUM(CASE WHEN s.day_number = 2 THEN 1 ELSE 0 END) AS second_day_off
FROM (SELECT a.employee,
a.first_day_off AS day_off,
1 AS day_number
FROM YOUR_TABLE a
WHERE a.first_day_off IS NOT NULL
UNION ALL
SELECT b.employee,
b.second_day_off,
2 AS day_number
FROM YOUR_TABLE b
WHERE b.second_day_off IS NOT NULL) s
GROUP BY s.day_off
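The same unpivot-then-aggregate pattern can be tried out with Python's sqlite3 module. This sketch assumes the three complete rows from the question, rewritten as ISO dates, and uses off_date instead of date as the output column name; your_table mirrors the placeholder name used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (employee INTEGER, first_day_off TEXT, "
    "second_day_off TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, "2009-10-21", "2009-12-06"),
        (2, "2009-09-03", "2009-12-06"),
        (3, "2009-09-03", None),
    ],
)

# UNION ALL stacks both day-off columns into one (day_off, day_number) list;
# the outer query then counts per date and per column of origin.
day_off_rows = conn.execute(
    """
    SELECT s.day_off AS off_date,
           SUM(CASE WHEN s.day_number = 1 THEN 1 ELSE 0 END) AS first_day_off,
           SUM(CASE WHEN s.day_number = 2 THEN 1 ELSE 0 END) AS second_day_off
    FROM (
        SELECT employee, first_day_off AS day_off, 1 AS day_number
        FROM your_table
        WHERE first_day_off IS NOT NULL
        UNION ALL
        SELECT employee, second_day_off, 2 AS day_number
        FROM your_table
        WHERE second_day_off IS NOT NULL
    ) AS s
    GROUP BY s.day_off
    ORDER BY s.day_off
    """
).fetchall()

for row in day_off_rows:
    print(row)
```

With these three rows, 2009-12-06 is counted twice under second_day_off, because both employees 1 and 2 list it as their second day.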

It is a bit awkward to handle these queries, since you have days off stored in different columns. A better layout would be to have something like
EMPLOYEE_ID DAY_OFF
Then you would have multiple rows if an employee took multiple days off
EMPLOYEE_ID DAY_OFF
1 10/21/2009
1 12/6/2009
2 09/3/2009
2 12/6/2009
3 09/3/2009
...
In that case, you could find out how many days off each person took by using the following query:
SELECT EMPLOYEE_ID, COUNT(*) AS NUM_DAYS_OFF FROM DAYS_OFF_TABLE GROUP BY EMPLOYEE_ID
And the number of people who took days off on each date like this:
SELECT DAY_OFF, COUNT(*) AS NUM_PEOPLE FROM DAYS_OFF_TABLE GROUP BY DAY_OFF
But I digress...
You can try to use an SQL CASE statement to help with this:
SELECT Employee, CASE
WHEN First_Day_Off is NULL AND Second_Day_Off is NULL THEN 0
WHEN First_Day_Off is NOT NULL AND Second_Day_Off is NULL THEN 1
WHEN First_Day_Off is NULL AND Second_Day_Off is NOT NULL THEN 1
ELSE 2
END AS NUM_DAYS_OFF
FROM DAYS_OFF_TABLE
(Note that you may need to change the syntax slightly depending on your database.)
Getting dates and number of people who took off on that day might be more complicated.
I haven't tested this, but something along these lines should work:
SELECT
Date_Off,
SUM(Num_People) AS Num_People
FROM
(SELECT
First_Day_Off AS Date_Off, COUNT(*) AS Num_People FROM DAYS_OFF_TABLE WHERE First_Day_Off IS NOT NULL GROUP BY First_Day_Off
UNION ALL
SELECT Second_Day_Off, COUNT(*) AS Num_People FROM DAYS_OFF_TABLE WHERE Second_Day_Off IS NOT NULL GROUP BY Second_Day_Off) d
GROUP BY
Date_Off
