Generating query for given table - sql

I have a table:
id year month day
1 2012 1 1
2 2012 1 2
3 2012 1 3
4 2012 1 4
5 2012 1 5
6 2012 1 7
7 2012 1 8
8 2012 1 9
9 2012 1 10
10 2012 1 11
From the above table, I want to generate the following output whenever the count of ids reaches a multiple of 5:
day_start day_end
1 5
5 11

I'm not sure where the Day_start came from. This may give you an idea on how to tackle this problem.
SELECT id as day_start, day as day_end
FROM MyTable
WHERE mod(id, 5) = 0

This will work in SQL Server. Using the row_number rather than the id field value will prevent errors if you delete rows and the sequence of ids is no longer complete.
Note: I'm basing this answer on the fact that you said the "count of id" reaches 5, and not "the value of id". Which to me means "every 5 records, regardless of the id value". If that's not the case, then leave a comment and I'll remove this answer, since Mark's will work fine.
select case when rownumber = 5 then 1 else rownumber - 5 end as day_start, day_end
from
(
select row_number() over (order by id) as RowNumber, [day] as day_end
from table1
) t
where rownumber % 5 = 0
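To illustrate the point about gaps in the ids, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or later is assumed for window functions). The table name table1 comes from the answer above; the deleted row is a hypothetical example, not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER, year INTEGER, month INTEGER, day INTEGER)"
)
days = [1, 2, 3, 4, 5, 7, 8, 9, 10, 11]  # day values from the question
conn.executemany(
    "INSERT INTO table1 VALUES (?, 2012, 1, ?)",
    [(i + 1, d) for i, d in enumerate(days)],
)
# Hypothetical deletion, to leave a gap in the id sequence.
conn.execute("DELETE FROM table1 WHERE id = 3")

# Filtering on the id value still picks ids 5 and 10,
# even though id 5 is now only the fourth remaining row.
by_id = conn.execute(
    "SELECT id, day FROM table1 WHERE id % 5 = 0 ORDER BY id"
).fetchall()

# Filtering on the row number picks every fifth remaining row.
by_rownum = conn.execute(
    """
    SELECT rn, day
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY id) AS rn, day FROM table1)
    WHERE rn % 5 = 0
    ORDER BY rn
    """
).fetchall()

print(by_id)      # [(5, 5), (10, 11)]
print(by_rownum)  # [(5, 7)]
```

With nine rows left, only one row number is a multiple of 5, and it lands on id 6 (day 7) rather than id 5.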

Assuming MySQL, try:
select min(coalesce(p.day,c.day)) day_start, max(c.day) day_end
from my_table c
left join my_table p on p.id = c.id-1
group by floor((c.id-1)/5)

SELECT CASE id WHEN 5 THEN 1 ELSE id - 5 END AS day_start, day AS day_end
FROM [TableName]
WHERE id % 5 = 0

Related

SQL query to partition rows into groups where lag (difference between rows) is greater than some value

Suppose I have a table like
id
1
3
4
10
12
19
and I'd like to group the ids (in sorted order) into the same group if they differ by 5 or less, and a new group if they differ by 6 or more. So the output would be:
id group
1 1
3 1
4 1
10 2
12 2
19 3
Is this possible in SQL? It will be a query in Trino, and I see they have commands like lag and partition. Has anyone made a query like this that can help out?
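You can use a CTE with lead. Note that summing a boolean expression, as done here, is MySQL-style syntax; in Trino you would probably need count_if or a CASE expression instead: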
with cte(id, l1) as (
select t.id, abs(coalesce(lead(t.id) over (order by t.id), 0) - t.id) < 6 from tbl t
)
select c.id, (select sum(c1.id < c.id and c1.l1 = 0) from cte c1) + 1 from cte c
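A common alternative for this kind of problem is to flag each row that starts a new group with lag, then take a running sum of the flags. The sketch below runs that idea through Python's sqlite3 module (SQLite 3.25+ assumed) with the ids from the question. It has not been tested on Trino, although it only uses standard window functions; the column names new_group and grp are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?)", [(1,), (3,), (4,), (10,), (12,), (19,)]
)

query = """
WITH flagged AS (
    SELECT id,
           -- 1 when the gap to the previous id is 6 or more, else 0.
           -- The first row has no previous id, so the comparison is
           -- NULL and the CASE falls through to 0.
           CASE WHEN id - LAG(id) OVER (ORDER BY id) > 5 THEN 1 ELSE 0 END
               AS new_group
    FROM tbl
)
-- The running sum of the flags, plus 1, is the group number.
SELECT id, 1 + SUM(new_group) OVER (ORDER BY id) AS grp
FROM flagged
ORDER BY id
"""
groups = conn.execute(query).fetchall()
print(groups)  # [(1, 1), (3, 1), (4, 1), (10, 2), (12, 2), (19, 3)]
```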

SQL Server - group and number matching contiguous values

I have a list of stock transactions and I am using Over(Partition By) to calculate the running totals (positions) by security. Over time a holding in a particular security can be long, short or flat. I am trying to find an efficient way to extract only the transactions relating to the current position for each security.
I have created a simplified sqlfiddle to show what I have so far. The CTE query generates the running total for each security (code_id) and identifies when the holdings are long (L), short (S) or flat (F). What I need is to group and number matching contiguous values of L, S or F for each code_id.
What I have so far is this:
; WITH RunningTotals as
(
SELECT
*,
RunningTotal = sum(qty) OVER (Partition By code_id Order By id)
FROM
TradeData
), LongShortFlat as
(
SELECT
*,
LSF = CASE
WHEN RunningTotal > 0 THEN 'L'
WHEN RunningTotal < 0 THEN 'S'
ELSE 'F'
END
FROM
RunningTotals
)
SELECT
*
FROM
LongShortFlat r
I think what I need to do is create a GroupNum column by applying a row_number for each group of L, S and F within each code_id so the results look like this:
id code_id qty RunningTotal LSF GroupNum
1 1 5 5 L 1
2 1 2 7 L 1
3 1 7 14 L 1
4 1 -3 11 L 1
5 1 -5 6 L 1
6 1 -6 0 F 2
7 1 5 5 L 3
8 1 5 10 L 3
9 1 -2 8 L 3
10 1 -4 4 L 3
11 2 5 5 L 1
12 2 3 8 L 1
13 2 -4 4 L 1
14 2 -2 2 L 1
15 2 -2 0 F 2
16 2 6 6 L 3
17 2 -5 1 L 3
18 2 -5 -4 S 4
19 2 2 -2 S 4
20 2 4 2 L 5
21 2 -5 -3 S 6
22 2 -2 -5 S 6
23 3 5 5 L 1
24 3 2 7 L 1
25 3 1 8 L 1
I am struggling to generate the GroupNum column.
Thanks in advance for your help.
[Revised]
Sorry about that, I read your question too quickly. I came up with a solution using a recursive common table expression (below), then saw that you've worked out a solution using LAG. I'll post my revised query anyway, for posterity. Either way, the resulting query is (imho) pretty ugly.
;WITH cteBaseAgg
as (
-- Build the "sum increases over time" data
SELECT
row_number() over (partition by td.code_id order by td.code_id, td.Id) RecurseKey
,td.code_id
,td.id
,td.qty
,sum(tdPrior.qty) RunningTotal
,case
when sum(tdPrior.qty) > 0 then 'L'
when sum(tdPrior.qty) < 0 then 'S'
else 'F'
end LSF
from dbo.TradeData td
inner join dbo.TradeData tdPrior
on tdPrior.code_id = td.code_id -- All for this code_id
and tdPrior.id <= td.Id -- For this and any prior Ids
group by
td.code_id
,td.id
,td.qty
)
,cteRecurse
as (
-- "Set" the first row for each code_id
SELECT
RecurseKey
,code_id
,id
,qty
,RunningTotal
,LSF
,1 GroupNum
from cteBaseAgg
where RecurseKey = 1
-- For each successive row in each set, check if need to increment GroupNum
UNION ALL SELECT
agg.RecurseKey
,agg.code_id
,agg.id
,agg.qty
,agg.RunningTotal
,agg.LSF
,rec.GroupNum + case when rec.LSF = agg.LSF then 0 else 1 end
from cteBaseAgg agg
inner join cteRecurse rec
on rec.code_id = agg.code_id
and agg.RecurseKey - 1 = rec.RecurseKey
)
-- Show results
SELECT
id
,code_id
,qty
,RunningTotal
,LSF
,GroupNum
from cteRecurse
order by
code_id
,id
Sorry for making this question a bit more complicated than it needed to be but for the sake of closure I have found a solution using the lag function.
In order to achieve what I wanted I continued my cte above with the following:
, a as
(
SELECT
*,
Lag(LSF, 1, LSF) OVER(Partition By code_id ORDER BY id) AS prev_LSF,
Lag(code_id, 1, code_id) OVER(Partition By code_id ORDER BY id) AS prev_code
FROM
LongShortFlat
), b as
(
SELECT
id,
LSF,
code_id,
Sum(CASE
WHEN LSF <> prev_LSF AND code_id = prev_code
THEN 1
ELSE 0
END) OVER(Partition By code_id ORDER BY id) AS grp
FROM
a
)
select * from b order by id
Here is the updated sqlfiddle.
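For anyone who wants to try the lag approach without the fiddle, here is a rough self-contained sketch using Python's sqlite3 module (SQLite 3.25+ assumed). It loads only the code_id 2 rows from the expected-results table in the question and adds 1 to the running sum so numbering starts at 1. The CTE and column names (running, flagged, changes, changed, group_num) are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TradeData (id INTEGER, code_id INTEGER, qty INTEGER)")
# qty values for code_id 2 (ids 11 to 22) from the question.
qtys = [5, 3, -4, -2, -2, 6, -5, -5, 2, 4, -5, -2]
conn.executemany(
    "INSERT INTO TradeData VALUES (?, 2, ?)",
    [(11 + i, q) for i, q in enumerate(qtys)],
)

query = """
WITH running AS (
    SELECT id, code_id, qty,
           SUM(qty) OVER (PARTITION BY code_id ORDER BY id) AS running_total
    FROM TradeData
), flagged AS (
    SELECT *,
           CASE WHEN running_total > 0 THEN 'L'
                WHEN running_total < 0 THEN 'S'
                ELSE 'F' END AS lsf
    FROM running
), changes AS (
    SELECT *,
           -- 1 when LSF differs from the previous row of the same code_id.
           -- On the first row LAG is NULL, so the CASE yields 0.
           CASE WHEN lsf <> LAG(lsf) OVER (PARTITION BY code_id ORDER BY id)
                THEN 1 ELSE 0 END AS changed
    FROM flagged
)
SELECT id, lsf,
       1 + SUM(changed) OVER (PARTITION BY code_id ORDER BY id) AS group_num
FROM changes
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The group_num column should come out as 1, 1, 1, 1, 2, 3, 3, 4, 4, 5, 6, 6, matching the GroupNum values for code_id 2 in the question.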

How to group by continuous rows in sqlite?

I have a table like:
id version count
1 0 3
2 0 4
3 0 3
4 1 3
5 1 2
6 1 1
7 0 3
8 0 5
I want to get a result like:
min_id version sum
1 0 10
4 1 6
7 0 8
If I use SELECT MIN(id), version, sum(count) group by version I get this:
min_id version sum
1 0 18
4 1 6
Because GROUP BY combines everything in the same version. I want to combine only those versions which are continuous, based on id.
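This is hard to do in SQLite, but possible. The performance is awful, but the idea is that, for any given row, you count the number of earlier rows (smaller id) that have a different version. That count is the same for every row in a continuous run, so together with the version it identifies each group. Voila!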
select version, min(id), max(id), sum(count)
from (select t.*,
(select count(*) from t t2 where t2.version <> t.version and t2.id < t.id) as grp
from t
) t
group by version, grp;
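Here is a quick way to check the query against the sample data, using Python's sqlite3 module. The table name t is taken from the answer; the column aliases min_id and total, and the ORDER BY, are additions for the example. The count column is quoted only as a precaution, since it shares its name with a function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (id INTEGER, version INTEGER, "count" INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 0, 3), (2, 0, 4), (3, 0, 3), (4, 1, 3),
     (5, 1, 2), (6, 1, 1), (7, 0, 3), (8, 0, 5)],
)

query = """
SELECT MIN(id) AS min_id, version, SUM("count") AS total
FROM (
    SELECT t.*,
           -- Number of earlier rows whose version differs from this row's.
           (SELECT COUNT(*) FROM t AS t2
            WHERE t2.version <> t.version AND t2.id < t.id) AS grp
    FROM t
) AS sub
GROUP BY version, grp
ORDER BY min_id
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 0, 10), (4, 1, 6), (7, 0, 8)]
```

Rows 1 to 3 get grp 0, while rows 4 to 6 and rows 7 to 8 both get grp 3. Grouping by version as well as grp is what keeps those last two runs apart.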

Oracle SQL query to group consecutive records

I've imported data ("Amount" and "Narration") from a spreadsheet into a table and need help with a query to group consecutive records according to their "Narration", for example:
Expected output:
line_no amount narration calc_group <-Not part of table
----------------------------------------
1 10 Reason 1 1
2 -10 Reason 1 1
3 5 Reason 2 2
4 5 Reason 2 2
5 -10 Reason 2 2
6 -8 Reason 1 3
7 8 Reason 1 3
8 11 Reason 1 3
9 99 Reason 3 4
10 -99 Reason 3 4
I've tried some analytical functions:
select line_no, amount, narration,
first_value (line_no) over
(partition by narration order by line_no) "calc_group"
from test
order by line_no
But that does not work because the Narration of line 6 to 8 is the same as line 1 and 2.
line_no amount narration calc_group
----------------------------------------
1 10 Reason 1 1
2 -10 Reason 1 1
3 5 Reason 2 3
4 5 Reason 2 3
5 -10 Reason 2 3
6 -8 Reason 1 1
7 8 Reason 1 1
8 11 Reason 1 1
9 99 Reason 3 4
10 -99 Reason 3 4
UPDATE
I've managed to do it using lag analytical function and sequences, not very elegant but it works. There should be a better way, please comment!
create or replace function get_next_test_seq
return number
as
begin
return test_seq.nextval;
end get_next_test_seq;
create or replace function get_curr_test_seq
return number
as
begin
return test_seq.currval;
end get_curr_test_seq;
update test
set group_no =
(with cte1
as (select line_no, amount, narration,
lag (narration) over (order by line_no) prev_narration, group_no
from test
order by line_no),
cte2
as (select line_no, amount, narration, group_no,
case when prev_narration is null or prev_narration <> narration then get_next_test_seq else get_curr_test_seq end new_group_no
from cte1)
select new_group_no
from cte2
where cte2.line_no = test.line_no);
UPDATE 2
I'm satisfied with the better accepted answer. Thanks kordiko!
Try this query:
SELECT line_no,
amount,
narration,
SUM( x ) OVER ( ORDER BY line_no
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as calc_group
FROM (
SELECT t.*,
CASE lag( narration ) OVER (order by line_no )
WHEN narration THEN 0
ELSE 1 END x
FROM test t
)
ORDER BY line_no
demo --> http://www.sqlfiddle.com/#!4/6d7aa/9

What is the SQL for 'next' and 'previous' in a table?

I have a table of items, each of which has a date associated with it. If I have the date associated with one item, how do I query the database with SQL to get the 'previous' and 'subsequent' items in the table?
It is not possible to simply add (or subtract) a value, as the dates do not have a regular gap between them.
One possible application would be 'previous/next' links in a photo album or blog web application, where the underlying data is in a SQL table.
I think there are two possible cases:
Firstly where each date is unique:
Sample data:
1,3,8,19,67,45
What query (or queries) would give 3 and 19 when supplied 8 as the parameter? (or the rows 3,8,19). Note that there are not always three rows to be returned - at the ends of the sequence one would be missing.
Secondly, if there is a separate unique key to order the elements by, what is the query to return the set 'surrounding' a date? The order expected is by date then key.
Sample data:
(key:date) 1:1,2:3,3:8,4:8,5:19,10:19,11:67,15:45,16:8
What query for '8' returns the set:
2:3,3:8,4:8,16:8,5:19
or what query generates the table:
key date prev-key next-key
1 1 null 2
2 3 1 3
3 8 2 4
4 8 3 16
5 19 16 10
10 19 5 11
11 67 10 15
15 45 11 null
16 8 4 5
The table order is not important - just the next-key and prev-key fields.
Both TheSoftwareJedi and Cade Roux have solutions that work for the data sets I posted last night. For the second question, both seem to fail for this dataset:
(key:date) 1:1,2:3,3:8,4:8,5:19,10:19,11:67,15:45,16:8
The order expected is by date then key, so one expected result might be:
2:3,3:8,4:8,16:8,5:19
and another:
key date prev-key next-key
1 1 null 2
2 3 1 3
3 8 2 4
4 8 3 16
5 19 16 10
10 19 5 11
11 67 10 15
15 45 11 null
16 8 4 5
The table order is not important - just the next-key and prev-key fields.
Select max(element) From Data Where Element < 8
Union
Select min(element) From Data Where Element > 8
But generally it is more useful to think of SQL in terms of set-oriented operations rather than iterative operations.
Self-joins.
For the table:
/*
CREATE TABLE [dbo].[stackoverflow_203302](
[val] [int] NOT NULL
) ON [PRIMARY]
*/
With parameter @val
SELECT cur.val, MAX(prv.val) AS prv_val, MIN(nxt.val) AS nxt_val
FROM stackoverflow_203302 AS cur
LEFT JOIN stackoverflow_203302 AS prv
ON cur.val > prv.val
LEFT JOIN stackoverflow_203302 AS nxt
ON cur.val < nxt.val
WHERE cur.val = @val
GROUP BY cur.val
You could make this a stored procedure with output parameters or just join this as a correlated subquery to the data you are pulling.
Without the parameter, for your data the result would be:
val prv_val nxt_val
----------- ----------- -----------
1 NULL 3
3 1 8
8 3 19
19 8 45
45 19 67
67 45 NULL
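The parameterised form of the self-join can be tried out with a short sketch in Python's sqlite3 module. The table name items is a stand-in for the example, and the ? placeholder plays the role of the parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (val INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO items VALUES (?)", [(1,), (3,), (8,), (19,), (67,), (45,)]
)

query = """
SELECT cur.val, MAX(prv.val) AS prv_val, MIN(nxt.val) AS nxt_val
FROM items AS cur
LEFT JOIN items AS prv ON cur.val > prv.val   -- every smaller value
LEFT JOIN items AS nxt ON cur.val < nxt.val   -- every larger value
WHERE cur.val = ?
GROUP BY cur.val
"""
middle = conn.execute(query, (8,)).fetchone()
first = conn.execute(query, (1,)).fetchone()
last = conn.execute(query, (67,)).fetchone()

print(middle)  # (8, 3, 19)
print(first)   # (1, None, 3)
print(last)    # (67, 45, None)
```

Thanks to the LEFT JOINs, the ends of the sequence still return a row, with NULL (None in Python) for the missing neighbour.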
For the modified example, you use this as a correlated subquery:
/*
CREATE TABLE [dbo].[stackoverflow_203302](
[ky] [int] NOT NULL,
[val] [int] NOT NULL,
CONSTRAINT [PK_stackoverflow_203302] PRIMARY KEY CLUSTERED (
[ky] ASC
)
)
*/
SELECT cur.ky AS cur_ky
,cur.val AS cur_val
,prv.ky AS prv_ky
,prv.val AS prv_val
,nxt.ky AS nxt_ky
,nxt.val as nxt_val
FROM (
SELECT cur.ky, MAX(prv.ky) AS prv_ky, MIN(nxt.ky) AS nxt_ky
FROM stackoverflow_203302 AS cur
LEFT JOIN stackoverflow_203302 AS prv
ON cur.ky > prv.ky
LEFT JOIN stackoverflow_203302 AS nxt
ON cur.ky < nxt.ky
GROUP BY cur.ky
) AS ordering
INNER JOIN stackoverflow_203302 as cur
ON cur.ky = ordering.ky
LEFT JOIN stackoverflow_203302 as prv
ON prv.ky = ordering.prv_ky
LEFT JOIN stackoverflow_203302 as nxt
ON nxt.ky = ordering.nxt_ky
With the output as expected:
cur_ky cur_val prv_ky prv_val nxt_ky nxt_val
----------- ----------- ----------- ----------- ----------- -----------
1 1 NULL NULL 2 3
2 3 1 1 3 8
3 8 2 3 4 19
4 19 3 8 5 67
5 67 4 19 6 45
6 45 5 67 NULL NULL
In SQL Server, I prefer to make the subquery a common table expression. This makes the code read more linearly, with less nesting, and easier to follow when there is a lot of nesting (it also cuts down on repetition in some re-joins).
Firstly, this should work:
select min(a)
from theTable
where a > 8;
select max(a)
from theTable
where a < 8;
For the second question that I begged you to ask...:
select *
from theTable
where date = 8
union all
select *
from theTable
where key = (select min(key)
from theTable
where key > (select max(key)
from theTable
where date = 8)
)
union all
select *
from theTable
where key = (select max(key)
from theTable
where key < (select min(key)
from theTable
where date = 8)
)
order by key
SELECT 'next' AS direction, MIN(date_field) AS date_key
FROM table_name
WHERE date_field > current_date
GROUP BY 1 -- necessity for group by varies from DBMS to DBMS in this context
UNION
SELECT 'prev' AS direction, MAX(date_field) AS date_key
FROM table_name
WHERE date_field < current_date
GROUP BY 1
ORDER BY 1 DESC;
Produces:
direction date_key
--------- --------
prev 3
next 19
My own attempt at the set solution, based on TheSoftwareJedi.
First question:
select date from test where date = 8
union all
select max(date) from test where date < 8
union all
select min(date) from test where date > 8
order by date;
Second question:
While debugging this, I used the data set:
(key:date) 1:1,2:3,3:8,4:8,5:19,10:19,11:67,15:45,16:8,17:3,18:1
to give this result:
select * from test2 where date = 8
union all
select * from (select * from test2
where date = (select max(date) from test2
where date < 8))
where key = (select max(key) from test2
where date = (select max(date) from test2
where date < 8))
union all
select * from (select * from test2
where date = (select min(date) from test2
where date > 8))
where key = (select min(key) from test2
where date = (select min(date) from test2
where date > 8))
order by date,key;
In both cases the final order by clause is strictly speaking optional.
If your RDBMS supports LAG and LEAD (Oracle, PostgreSQL, SQL Server 2012), this is straightforward.
These functions let you pick the row on either side of any given row in a single query.
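As a rough illustration of LAG and LEAD for the second question (order by date, then key), here is a sketch using Python's sqlite3 module, which supports both functions from SQLite 3.25 onward. The table name items and the column names ky and dt are made up for the example, chosen to avoid clashing with the keywords key and date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ky INTEGER PRIMARY KEY, dt INTEGER NOT NULL)")
# (key, date) pairs from the question's second data set.
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, 1), (2, 3), (3, 8), (4, 8), (5, 19),
     (10, 19), (11, 67), (15, 45), (16, 8)],
)

query = """
SELECT ky, dt,
       LAG(ky)  OVER (ORDER BY dt, ky) AS prev_ky,  -- key of the row before
       LEAD(ky) OVER (ORDER BY dt, ky) AS next_ky   -- key of the row after
FROM items
ORDER BY dt, ky
"""
rows = conn.execute(query).fetchall()
by_key = {ky: (prev_ky, next_ky) for ky, dt, prev_ky, next_ky in rows}

print(by_key[4])   # (3, 16)
print(by_key[16])  # (4, 5)
print(by_key[1])   # (None, 2)
```

Because the ordering here is strictly by date and then key, the rows with dates 45 and 67 are linked in date order (key 15 before key 11), which is not how those two rows appear in the question's table.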
Try this...
SELECT TOP 3 * FROM YourTable
WHERE Col >= (SELECT MAX(Col) FROM YourTable b WHERE Col < @Parameter)
ORDER BY Col