How to take the smallest date of a group? - sql

I have a dataset that looks like this:
| id | status | open_date | name |
| 8 | active | 2019-3-2 | blab |
| 8 | active | 2019-3-8 | blub |
| 8 | inactive | 2019-3-9 | hans |
| 8 | active | 2019-3-10 | ana |
| 9 | active | 2019-3-4 | mars |
I want to achieve the following:
| id | status | open_date | name | status_change_date |
| 8 | active | 2019-3-2 | blab | 2019-3-2 |
| 8 | active | 2019-3-8 | blub | 2019-3-2 |
| 8 | inactive | 2019-3-9 | hans | 2019-3-9 |
| 8 | active | 2019-3-10 | ana | 2019-3-10 |
| 9 | active | 2019-3-4 | mars | 2019-3-4 |
For each id, I would like to calculate when the status last changed.
I already tried groupBy, but the problem is that I only want to group rows with the same status (active or inactive) that are next to each other. If there is an inactive row between two active rows, I want to start a new group for the later active row.
Does anyone have an idea how to solve that?

Here is a pure SQL solution that uses window functions. This works by generating a partition that contains consecutive records that have the same id and status.
SELECT
    id,
    status,
    open_date,
    name,
    -- status is part of the partition so that islands of different statuses
    -- that happen to share the same rn1 - rn2 value are not merged
    MIN(open_date) OVER(PARTITION BY id, status, rn1 - rn2 ORDER BY open_date) status_change_date
FROM (
    SELECT
        t.*,
        ROW_NUMBER() OVER(PARTITION BY id ORDER BY open_date) rn1,
        ROW_NUMBER() OVER(PARTITION BY id, status ORDER BY open_date) rn2
    FROM mytable t
) x
ORDER BY id, open_date
Demo on DB Fiddle:
| id | status | open_date | name | status_change_date |
| --- | -------- | ---------- | ---- | ------------------ |
| 8 | active | 2019-03-02 | blab | 2019-03-02 |
| 8 | active | 2019-03-08 | blub | 2019-03-02 |
| 8 | inactive | 2019-03-09 | hans | 2019-03-09 |
| 8 | active | 2019-03-10 | ana | 2019-03-10 |
| 9 | active | 2019-03-04 | mars | 2019-03-04 |
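To make the row-number trick easier to follow, here is a minimal Python sketch of the same idea, using the sample rows from the question. The function name `status_change_dates` and the tuple layout are assumptions made for illustration. It computes the two row numbers per row, keys each island by (id, status, rn1 - rn2), and attaches the earliest open_date of the island to every row in it.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (id, status, open_date, name)
rows = [
    (8, "active", date(2019, 3, 2), "blab"),
    (8, "active", date(2019, 3, 8), "blub"),
    (8, "inactive", date(2019, 3, 9), "hans"),
    (8, "active", date(2019, 3, 10), "ana"),
    (9, "active", date(2019, 3, 4), "mars"),
]


def status_change_dates(rows):
    """Append the first open_date of each run of consecutive equal statuses."""
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    rn1 = defaultdict(int)  # row number per id
    rn2 = defaultdict(int)  # row number per (id, status)
    keyed = []
    for rid, status, open_date, name in ordered:
        rn1[rid] += 1
        rn2[(rid, status)] += 1
        # The difference stays constant while the status does not change
        island = (rid, status, rn1[rid] - rn2[(rid, status)])
        keyed.append((island, (rid, status, open_date, name)))
    first_date = {}
    for island, row in keyed:
        first_date[island] = min(first_date.get(island, row[2]), row[2])
    return [row + (first_date[island],) for island, row in keyed]


for row in status_change_dates(rows):
    print(row)
```

Keying the island on the status as well as the difference matters for alternating data such as active, inactive, active, inactive, where two neighbouring islands can end up with the same difference.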

That's the answer to "How to take the smallest date of a group?":
let minDate = null;
dataset.forEach(x => {
  // keep the earliest date seen so far (assumes each x.date is a Date)
  if (minDate === null || x.date < minDate) { minDate = x.date; }
});
console.log(minDate);

You can try this:
var movies = [
{title: 'The Godfather', rating: 9.2, release: '24 March 1972'},
{title: 'The Godfather: Part II', rating: 9.0, release: '20 December 1972'},
{title: 'The Shawshank Redemption', rating: 9.3, release: '14 October 1994'},
];
movies.sort(function(a, b) {
var dateA = new Date(a.release), dateB = new Date(b.release);
return dateA - dateB;
});
This sort works because JavaScript converts Date objects to their numeric timestamps when one is subtracted from the other, so after sorting, movies[0] holds the earliest release.
In SQL, use the MIN function. Given a table [Order] with the columns Id, OrderDate, OrderNumber, CustomerId and TotalAmount:
SELECT MIN(OrderDate)
FROM [Order]
WHERE YEAR(OrderDate) = 2013

Related

Combine PARTITION BY and GROUP BY

I have a (mssql) table like this:
+----+----------+---------+--------+--------+
| id | username | date | scoreA | scoreB |
+----+----------+---------+--------+--------+
| 1 | jim | 01/2020 | 100 | 0 |
| 2 | max | 01/2020 | 0 | 200 |
| 3 | jim | 01/2020 | 0 | 150 |
| 4 | max | 02/2020 | 150 | 0 |
| 5 | jim | 02/2020 | 0 | 300 |
| 6 | lee | 02/2020 | 100 | 0 |
| 7 | max | 02/2020 | 0 | 200 |
+----+----------+---------+--------+--------+
What I need is to get the best "combined" score per date. (With "combined" score I mean the best scores per user and per date summarized)
The result should look like this:
+----------+---------+--------------------------------------------+
| username | date | combined_score (max(scoreA) + max(scoreB)) |
+----------+---------+--------------------------------------------+
| jim | 01/2020 | 250 |
| max | 02/2020 | 350 |
+----------+---------+--------------------------------------------+
This is how far I got.
I can group the scores by user like this:
SELECT
username, (max(scoreA) + max(scoreB)) AS combined_score
FROM score_table
GROUP BY username
ORDER BY combined_score DESC
And I can get the best score per date with PARTITION BY like this:
SELECT *
FROM
(SELECT t.*, row_number() OVER (PARTITION BY date ORDER BY scoreA DESC) rn
FROM score_table t) as tmp
WHERE tmp.rn = 1
ORDER BY date
Is there a proper way to combine these statements and get the result I need? Thank you!
By the way, I don't care about possible ties.
You can combine window functions and aggregation functions like this:
SELECT s.*
FROM (SELECT username, date, (max(scoreA) + max(scoreB)) AS combined_score,
ROW_NUMBER() OVER (PARTITION BY date ORDER BY max(scoreA) + max(scoreB) DESC) as seqnum
FROM score_table
GROUP BY username, date
) s
WHERE seqnum = 1
ORDER BY combined_score DESC;
Note that date needs to be part of the GROUP BY, so that the scores are aggregated per user and per date before the rows are ranked within each date.
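The two steps (aggregate per user and date, then keep the best row per date) can be sketched in plain Python with the sample data from the question. The function name `best_combined_per_date` is an assumption made for illustration, and ties are resolved arbitrarily, as the question allows.

```python
# Sample rows from the question: (id, username, date, scoreA, scoreB)
score_table = [
    (1, "jim", "01/2020", 100, 0),
    (2, "max", "01/2020", 0, 200),
    (3, "jim", "01/2020", 0, 150),
    (4, "max", "02/2020", 150, 0),
    (5, "jim", "02/2020", 0, 300),
    (6, "lee", "02/2020", 100, 0),
    (7, "max", "02/2020", 0, 200),
]


def best_combined_per_date(rows):
    # Step 1 (GROUP BY username, date): track max(scoreA) and max(scoreB)
    best = {}
    for _id, username, date, score_a, score_b in rows:
        a, b = best.get((username, date), (score_a, score_b))
        best[(username, date)] = (max(a, score_a), max(b, score_b))
    # Step 2 (row number 1 per date): keep the highest combined score
    winners = {}
    for (username, date), (a, b) in best.items():
        combined = a + b
        if date not in winners or combined > winners[date][1]:
            winners[date] = (username, combined)
    return sorted((user, date, combined) for date, (user, combined) in winners.items())


print(best_combined_per_date(score_table))
# [('jim', '01/2020', 250), ('max', '02/2020', 350)]
```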

SQL Query to Find Min and Max Values between Values, dates and companies in the same Query

I want to find the historic max and min closing price of each stock over the past 10 days from the current date, in a single query. The data is below. The query I tried returns the same high and low for all rows. The high and low need to be calculated per stock for a period of 10 days.
RDBMS: SQL Server 2014
Note: the period might also be longer if required, e.g. 30 or 60 days.
For example, the output needs to look like ABB, 16-12-2019, 1480 (MaxClose), 1222 (MinClose) (test data) for the last 10 days.
+------+------------+-------------+
| Name | Date | Close |
+------+------------+-------------+
| ABB | 26-12-2019 | 1272.15 |
| ABB | 24-12-2019 | 1260.15 |
| ABB | 23-12-2019 | 1261.3 |
| ABB | 20-12-2019 | 1262 |
| ABB | 19-12-2019 | 1476 |
| ABB | 18-12-2019 | 1451.45 |
| ABB | 17-12-2019 | 1474.4 |
| ABB | 16-12-2019 | 1480.4 |
| ABB | 13-12-2019 | 1487.25 |
| ABB | 12-12-2019 | 1484.5 |
| INFY | 26-12-2019 | 73041.66667 |
| INFY | 24-12-2019 | 73038.33333 |
| INFY | 23-12-2019 | 73036.66667 |
| INFY | 20-12-2019 | 73031.66667 |
| INFY | 19-12-2019 | 73030 |
| INFY | 18-12-2019 | 73028.33333 |
| INFY | 17-12-2019 | 73026.66667 |
| INFY | 16-12-2019 | 73025 |
| INFY | 13-12-2019 | 73020 |
| INFY | 12-12-2019 | 73018.33333 |
+------+------------+-------------+
The query I tried, without luck:
select max([close]) over (PARTITION BY name) AS MaxClose,
min([close]) over (PARTITION BY name) AS MinClose,
[Date],
name
from historic
where [DATE] between [DATE] -30 and [DATE]
and name='ABB'
group by [Date],
[NAME],
[close]
order by [DATE] desc
If you just want the highest and lowest close per name over the last 10 days, then simple aggregation is enough. Note that the date filter belongs on the date column, and that close needs brackets because CLOSE is a reserved word in SQL Server:
select name, max([close]) max_close, min([close]) min_close
from historic
where [date] >= dateadd(day, -10, getdate())
group by name
order by name
If you want the entire corresponding records, then rank() is a solution:
select name, [date], [close]
from (
select
h.*,
rank() over(partition by name order by [close]) rn1,
rank() over(partition by name order by [close] desc) rn2
from historic h
where [date] >= dateadd(day, -10, getdate())
) t
where rn1 = 1 or rn2 = 1
order by name, [date]
Top and bottom ties will show up, if there are any.
You can add a where condition to filter on a given name.
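If you are looking for a running min/max, you can use a window frame. For example:
-- The frame is the current row plus the 10 preceding rows per name;
-- it counts rows (trading days in the data), not calendar days.
Select *
,MinClose = min([Close]) over (partition by name order by date rows between 10 preceding and current row)
,MaxClose = max([Close]) over (partition by name order by date rows between 10 preceding and current row)
From YourTable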
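To illustrate what a row-based frame computes, here is a small Python sketch over a few of the ABB rows from the question. The function name `rolling_min_max` and its `preceding` parameter are assumptions made for illustration. Each output row holds the min and max close over the current row and up to `preceding` earlier rows of the same name.

```python
from datetime import date

# A few ABB rows from the question: (name, date, close)
closes = [
    ("ABB", date(2019, 12, 12), 1484.5),
    ("ABB", date(2019, 12, 13), 1487.25),
    ("ABB", date(2019, 12, 16), 1480.4),
    ("ABB", date(2019, 12, 17), 1474.4),
    ("ABB", date(2019, 12, 18), 1451.45),
]


def rolling_min_max(rows, preceding):
    """Min/max close over the current row and `preceding` earlier rows per name."""
    by_name = {}
    for name, day, close in sorted(rows, key=lambda r: (r[0], r[1])):
        by_name.setdefault(name, []).append((day, close))
    out = []
    for name, series in by_name.items():
        for i, (day, _close) in enumerate(series):
            # Equivalent of ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW
            window = [c for _, c in series[max(0, i - preceding): i + 1]]
            out.append((name, day, min(window), max(window)))
    return out


for row in rolling_min_max(closes, preceding=2):
    print(row)
```

With `preceding=2` the window covers three rows, in the same way that `10 preceding` in the query covers eleven.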

T-SQL - Turn table with current page and previous pages into a sequential order per session

I'm trying to create a table to show the activity per session on a website.
It should look something like this.
Preferred table:
+------------+---------+--------------+-----------+
| SessionID | PageSeq | Page | Duration |
+------------+---------+--------------+-----------+
| 1 | 1 | Home | 5 |
| 1 | 2 | Sales | 10 |
| 1 | 3 | Contact | 9 |
| 2 | 1 | Sales | 5 |
| 3 | 1 | Home | 30 |
| 3 | 2 | Sales | 5 |
+------------+---------+--------------+-----------+
Unfortunately, my current dataset doesn't contain a session ID, but it can be deduced from the time and the path.
Current table:
+------------------+---------+------------+---------------+----------+
| DATE_HOUR_MINUTE | Page | Prev_page | Total_session | Duration |
+------------------+---------+------------+---------------+----------+
| 201801012020 | Home | (entrance) | 24 | 5 |
| 201801012020 | Sales | Home | 24 | 10 |
| 201801012020 | Contact | Sales | 24 | 9 |
| 201801012020 | Sales | (entrance) | 5 | 5 |
| 201801012020 | Home | (entrance) | 35 | 30 |
| 201801012020 | Sales | Home | 35 | 5 |
+------------------+---------+------------+---------------+----------+
What is the best way to turn the current table into the preferred table format?
I've tried searching for nested tables and looped tables, but haven't found anything related to this problem yet.
If you can accept the risk of two sessions starting at the same time with the same total duration, this is easy enough to do with a recursive query.
;WITH sessionTree AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as sessionId
, 1 AS PageSeq
, *
FROM Session
WHERE PrevPage = '(entrance)'
UNION ALL
SELECT prev.sessionId
, prev.PageSeq + 1
, next.*
FROM sessionTree prev
JOIN Session next
ON next.TotalDuration = prev.TotalDuration
AND next.PrevPage = prev.Page
AND next.date_hour_minute >= prev.date_hour_minute
)
SELECT * FROM sessionTree
ORDER BY sessionId, PageSeq
A sessionId is generated for each row that has (entrance) as PrevPage, with PageSeq = 1. The recursive part then joins visits that have the same total session duration and a timestamp no earlier than the previous page, on the condition prev.Page = next.PrevPage. The query appears to use the names PrevPage and TotalDuration for the question's Prev_page and Total_session columns.
Here's a working example on dbfiddle
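The chaining logic can also be sketched in plain Python, using the rows from the question. The function name `build_sessions` is an assumption made for illustration. Unlike the recursive CTE, which follows every matching row, this sketch follows only the first unused match, so it is a simplification under the same "no colliding sessions" assumption.

```python
# Rows from the question: (date_hour_minute, page, prev_page, total_session, duration)
visits = [
    ("201801012020", "Home", "(entrance)", 24, 5),
    ("201801012020", "Sales", "Home", 24, 10),
    ("201801012020", "Contact", "Sales", 24, 9),
    ("201801012020", "Sales", "(entrance)", 5, 5),
    ("201801012020", "Home", "(entrance)", 35, 30),
    ("201801012020", "Sales", "Home", 35, 5),
]


def build_sessions(rows):
    """Return (session_id, page_seq, page, duration) tuples."""
    result = []
    used = set()
    # Every entrance row starts a new session
    entrances = [i for i, r in enumerate(rows) if r[2] == "(entrance)"]
    for session_id, start in enumerate(entrances, start=1):
        current, page_seq = start, 1
        while current is not None:
            used.add(current)
            stamp, page, _prev, total, duration = rows[current]
            result.append((session_id, page_seq, page, duration))
            # Next page: same session total, previous page equals the current
            # page, and a timestamp that is not earlier
            current = next(
                (i for i, r in enumerate(rows)
                 if i not in used and r[2] == page and r[3] == total and r[0] >= stamp),
                None,
            )
            page_seq += 1
    return result


for row in build_sessions(visits):
    print(row)
```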

SQL Server Active Record counts by month

I've created a database storing Incident tickets.
I have created a fact and a number of dimension tables.
Here is some sample data
+---------------------+--------------+--------------+-------------+------------+
| LastModifiedDateKey | TicketNumber | Status | factCurrent | Date |
+---------------------+--------------+--------------+-------------+------------+
| 2774 | T:9992260 | Open | 1 | 4/12/2017 |
| 2777 | T:9992805 | Open | 1 | 7/12/2017 |
| 2777 | T:9993068 | Open | 1 | 7/12/2017 |
| 2777 | T:9993098 | Open | 0 | 7/12/2017 |
| 2793 | T:9993098 | Acknowledged | 0 | 23/12/2017 |
| 2928 | T:9993098 | Closed | 1 | 5/01/2018 |
| 2777 | T:9993799 | Open | 0 | 7/12/2017 |
| 2928 | T:9993799 | Closed | 1 | 5/01/2018 |
| 2778 | T:9994729 | Open | 1 | 8/12/2017 |
| 2774 | T:9994791 | Open | 0 | 4/12/2017 |
| 2928 | T:9994791 | Closed | 1 | 5/01/2018 |
| 2777 | T:9994912 | Open | 1 | 7/12/2017 |
| 2778 | T:9995201 | Open | 0 | 8/12/2017 |
| 2793 | T:9995201 | Closed | 1 | 23/12/2017 |
| 2931 | T:9718629 | Open | 1 | 8/01/2018 |
| 2933 | T:9718629 | Closed | 1 | 10/01/2018 |
| 2932 | T:9855664 | Open | 1 | 9/01/2018 |
| 2931 | T:9891975 | Open | 1 | 8/01/2018 |
+---------------------+--------------+--------------+-------------+------------+
I want a query that will give me the total number of tickets open at the end of each month.
In the data January should have 8 and Feb 2.
Note that a ticket can have multiple rows with the same status because a dimension key has changed, or multiple rows with different statuses all in the same month, e.g. T:9993098.
This approach first uses ROW_NUMBER to identify the most recent record for each ticket, for each month/year. It is assumed that the most recent record in a month will contain the status in which a ticket ended for that month. Then, it aggregates over this modified table, counting only tickets which ended the month in an open status.
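SELECT
    -- cast the numeric year and month to strings before concatenating
    CAST(YEAR(Date) AS varchar(4)) + '-' + CAST(MONTH(Date) AS varchar(2)) AS date,
    COUNT(*) AS num_open_tickets
FROM
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY YEAR(Date), MONTH(Date), TicketNumber
                           ORDER BY Date DESC) rn
    FROM yourTable
) t
WHERE t.rn = 1 AND t.Status = 'Open'
GROUP BY
    CAST(YEAR(Date) AS varchar(4)) + '-' + CAST(MONTH(Date) AS varchar(2));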
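As a rough illustration of this approach, the Python sketch below keeps the most recent record per ticket and month and then counts those whose status is Open. It uses a subset of the sample rows, and the function name `open_at_month_end` is an assumption. As with the query, a ticket is only counted in months in which it has at least one record.

```python
from datetime import date

# Subset of the sample rows: (ticket_number, status, date)
tickets = [
    ("T:9992260", "Open", date(2017, 12, 4)),
    ("T:9993098", "Open", date(2017, 12, 7)),
    ("T:9993098", "Acknowledged", date(2017, 12, 23)),
    ("T:9993098", "Closed", date(2018, 1, 5)),
    ("T:9995201", "Open", date(2017, 12, 8)),
    ("T:9995201", "Closed", date(2017, 12, 23)),
    ("T:9855664", "Open", date(2018, 1, 9)),
]


def open_at_month_end(rows):
    """Count tickets whose latest record in a month has status 'Open'."""
    latest = {}
    for ticket, status, day in rows:
        key = (day.year, day.month, ticket)
        # Keep the most recent record per ticket and month (the rn = 1 row)
        if key not in latest or day >= latest[key][0]:
            latest[key] = (day, status)
    counts = {}
    for (year, month, _ticket), (_day, status) in latest.items():
        if status == "Open":
            counts[(year, month)] = counts.get((year, month), 0) + 1
    return counts


print(open_at_month_end(tickets))
# {(2017, 12): 1, (2018, 1): 1}
```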
First, I would generate the months. Then do a cumulative count of the opens minus the closes. Alas, that is a bit tricky because of the repeated rows for a ticket and because you are using an old version of SQL Server.
But . . . you can do this:
with months as (
select dateadd(day, 1 - day(min(date)), min(date)) as mon_start,
max(date) as max_date
from sample
union all
select dateadd(month, 1, mon_start), max_date
from months
where dateadd(month, 1, mon_start) < max_date
)
select m.mon_end,
(select count(distinct case when status = 'Open' then ticket end) -
count(distinct case when status = 'Closed' then ticket end)
from sample s
where s.date <= m.mon_end
) as open_tickets
from (select dateadd(day, -1, mon_start) as mon_end
from months
) m;
This uses a recursive CTE to generate the months. It is easier to generate the first day of each month and then subtract one day afterwards than to generate month ends directly (what is the date when you add 1 month to the last day of February?).
The rest uses a correlated subquery to count the number of open tickets on that date.
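The counting idea (distinct tickets opened minus distinct tickets closed, as of each month end) can be sketched in Python as follows. The helper names `month_ends` and `open_tickets_by_month_end` are assumptions made for illustration, the data is a subset of the sample rows, and the sketch evaluates the last day of every month spanned by the data, which is a simplification of the month list built by the CTE.

```python
from datetime import date, timedelta

# Subset of the sample rows: (ticket_number, status, date)
tickets = [
    ("T:9992260", "Open", date(2017, 12, 4)),
    ("T:9993098", "Open", date(2017, 12, 7)),
    ("T:9993098", "Acknowledged", date(2017, 12, 23)),
    ("T:9993098", "Closed", date(2018, 1, 5)),
    ("T:9995201", "Open", date(2017, 12, 8)),
    ("T:9995201", "Closed", date(2017, 12, 23)),
    ("T:9855664", "Open", date(2018, 1, 9)),
]


def month_ends(first, last):
    """Yield the last day of each month from first's month through last's month."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        # First day of the next month minus one day is the month end
        yield date(next_year, next_month, 1) - timedelta(days=1)
        year, month = next_year, next_month


def open_tickets_by_month_end(rows):
    days = [d for _, _, d in rows]
    result = {}
    for end in month_ends(min(days), max(days)):
        opened = {t for t, s, d in rows if s == "Open" and d <= end}
        closed = {t for t, s, d in rows if s == "Closed" and d <= end}
        result[end] = len(opened) - len(closed)
    return result


print(open_tickets_by_month_end(tickets))
# {datetime.date(2017, 12, 31): 2, datetime.date(2018, 1, 31): 2}
```

Using sets mirrors count(distinct ...) in the query, so repeated rows for the same ticket and status are only counted once.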

SQL Server: how do I get data from a history table?

Can you please help me build an SQL query to retrieve data from a history table?
I'm a newbie with only one week of coding experience. I've been trying simple SELECT statements so far but have hit a stumbling block.
My football club's database has three tables. The first one links balls to players:
BallDetail
| BallID | PlayerID | TeamID |
|--------|----------|--------|
| 1 | 11 | 21 |
| 2 | 12 | 22 |
The second one lists things that happen to the balls:
BallEventHistory
| BallID | Event | EventDate |
|--------|-------|------------|
| 1 | Pass | 2012-01-01 |
| 1 | Shoot | 2012-02-01 |
| 1 | Miss | 2012-03-01 |
| 2 | Pass | 2012-01-01 |
| 2 | Shoot | 2012-02-01 |
And the third one is a history change table. After a ball changes hands, history is recorded:
HistoryChanges
| BallID | ColumnName | ValueOld | ValueNew |
|--------|------------|----------|----------|
| 2 | PlayerID | 11 | 12 |
| 2 | TeamID | 21 | 22 |
I'm trying to obtain a table that would list all passes and shoots Player 11 had done to all balls before the balls went to other players. Like this:
| PlayerID | BallID | Event | Month |
|----------|--------|-------|-------|
| 11 | 1 | Pass | Jan |
| 11 | 1 | Shoot | Feb |
| 11 | 2 | Pass | Jan |
I started like this:
SELECT bd.PlayerID, bd.BallID, beh.Event, DateName(month, beh.EventDate)
FROM BallDetail bd INNER JOIN BallEventHistory beh ON bd.BallID = beh.BallID
WHERE bd.PlayerID = 11 AND beh.Event IN ('Pass', 'Shoot') ...
But how do I make sure that Ball 2 also gets included despite being with another player now?
-- Assumes HistoryChanges also has a ChangeDate column, which the question does not show.
-- PlayerID per event: ValueNew of the latest change on or before the event, else the current BallDetail.PlayerID.
Select PlayerID, BallID, Event, datename(month, EventDate) as Month, Count(*) as cnt
from (
    Select
        Coalesce(
            (Select ValueNew from #HistoryChanges
             where BallID = h.BallID and ColumnName = 'PlayerID'
               and ChangeDate = (Select max(ChangeDate) from #HistoryChanges h2
                                 where h2.BallID = h.BallID and h2.ColumnName = 'PlayerID' and h2.ChangeDate <= h.EventDate)),
            (Select PlayerID from #BallDetail where BallID = h.BallID)
        ) as PlayerID,
        h.BallID, h.Event, h.EventDate
    from #BallEventHistory h
) a
Group by PlayerID, BallID, Event, datename(month, EventDate)
-- Assumes HistoryChanges has a ChangeDate column, which the question does not show.
-- The parentheses keep the Pass/Shoot filter applied to both branches of the OR.
SELECT d.PlayerID, d.BallID, h.Event, DATENAME(mm, h.EventDate) AS Month
FROM BallDetail d JOIN BallEventHistory h ON d.BallID = h.BallID
WHERE h.Event IN ('Pass', 'Shoot')
  AND (d.PlayerID = 11
       OR EXISTS (SELECT 1
                  FROM dbo.HistoryChanges c
                  WHERE c.ValueOld = 11 AND c.ValueNew = d.PlayerID AND c.ColumnName = 'PlayerID' and c.ChangeDate = h.EventDate))