Integrate row information with previous rows in SQL

I need to integrate row information with previous rows:
| ID | no | number |
+----+----+--------+
| 1  | 40 | 10     |
| 2  | 32 | 12     |
| 3  | 40 | 15     |
| 4  | 45 | 23     |
| 5  | 32 | 15     |
| 6  | 12 | 14     |
| 7  | 40 | 20     |
| 8  | 32 | 18     |
| 9  | 45 | 27     |
| 10 | 12 | 16     |
Desired result:
| ID | no | number | last number |
+----+----+--------+-------------+
| 1  | 40 | 10     | 0           |
| 2  | 32 | 12     | 0           |
| 3  | 40 | 15     | 0           |
| 4  | 45 | 23     | 0           |
| 5  | 32 | 15     | 12          |
| 6  | 12 | 14     | 0           |
| 7  | 40 | 20     | 15          |
| 8  | 32 | 18     | 15          |
| 9  | 45 | 27     | 23          |
| 10 | 12 | 16     | 14          |

My best guess is that you are looking for a script like the one below. Note that, by this logic, the row with id = 3 gets 10 in the 'last number' column (its previous row with no = 40 is id = 1), rather than the 0 shown in your desired output.
SELECT *,
       ISNULL(
           (SELECT number
            FROM your_table C
            WHERE C.ID = (SELECT MAX(ID)
                          FROM your_table B
                          WHERE B.ID < A.ID
                            AND B.no = A.no)),
           0) AS [last number]
FROM your_table A
Output:
| ID | no | number | last number |
+----+----+--------+-------------+
| 1  | 40 | 10     | 0           |
| 2  | 32 | 12     | 0           |
| 3  | 40 | 15     | 10          |
| 4  | 45 | 23     | 0           |
| 5  | 32 | 15     | 12          |
| 6  | 12 | 14     | 0           |
| 7  | 40 | 20     | 15          |
| 8  | 32 | 18     | 15          |
| 9  | 45 | 27     | 23          |
| 10 | 12 | 16     | 14          |
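On SQL Server 2012 or later, a window function gives the same result without the nested subqueries; a minimal sketch, assuming the same your_table layout:
SELECT ID, no, number,
       ISNULL(LAG(number) OVER (PARTITION BY no ORDER BY ID), 0) AS [last number]
FROM your_table
LAG(number) returns the number from the previous row with the same no (ordered by ID), and NULL for the first occurrence, which ISNULL turns into 0.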

Related

Is there a way to shuffle rows in a table into distinctive fixed size chunks using SQL only?

I have a very big table (~300 million rows) with the following structure:
my_table(id, group, chunk, new_id), where chunk and new_id are set to NULL.
I want to assign the rows of each group to random chunks, with a distinct new_id within each chunk. Each chunk should have a fixed size of 100.
For example, if group A has 1278 rows, they should go into 13 chunks (0-12): 12 chunks with 100 rows s.t. new_id is in range (0-99), and another single chunk with 78 rows s.t. new_id is in range (0-77).
The organization into chunks and within the chunks should be a random permutation, where each row in A is assigned a unique (chunk, new_id) tuple.
I'm successfully doing it using pandas but it takes hours, mostly due to memory and bandwidth limitations.
Is it possible to execute using only a SQL query?
I'm using postgres 9.6.
You could do this with row_number():
select id, "group", rn / 100 as chunk, rn % 100 as new_id
from (select t.*, row_number() over(order by random()) - 1 as rn
      from mytable t) t
The inner query assigns a random, distinct integer (starting at 0) to each record. The outer query does the arithmetic to compute the chunk and new_id. (Note that group is a reserved word in Postgres and has to be quoted; the demo below sidesteps this by naming the column grp.)
If you want an update query:
update mytable t set chunk = x.rn / 3, new_id = x.rn % 3
from (select id, row_number() over(order by random()) - 1 rn from mytable t) x
where x.id = t.id
Demo on DB Fiddle for a dataset of 20 records with chunks of 3 records.
Before:
id | grp | chunk | new_id
-: | --: | ----: | -----:
 1 |   1 |  null |   null
 2 |   2 |  null |   null
 3 |   3 |  null |   null
 4 |   4 |  null |   null
 5 |   5 |  null |   null
 6 |   6 |  null |   null
 7 |   7 |  null |   null
 8 |   8 |  null |   null
 9 |   9 |  null |   null
10 |  10 |  null |   null
11 |  11 |  null |   null
12 |  12 |  null |   null
13 |  13 |  null |   null
14 |  14 |  null |   null
15 |  15 |  null |   null
16 |  16 |  null |   null
17 |  17 |  null |   null
18 |  18 |  null |   null
19 |  19 |  null |   null
20 |  20 |  null |   null
After:
id | grp | chunk | new_id
-: | --: | ----: | -----:
19 | 19 | 0 | 0
11 | 11 | 0 | 1
20 | 20 | 0 | 2
12 | 12 | 1 | 0
14 | 14 | 1 | 1
17 | 17 | 1 | 2
3 | 3 | 2 | 0
8 | 8 | 2 | 1
5 | 5 | 2 | 2
13 | 13 | 3 | 0
10 | 10 | 3 | 1
2 | 2 | 3 | 2
16 | 16 | 4 | 0
18 | 18 | 4 | 1
6 | 6 | 4 | 2
1 | 1 | 5 | 0
15 | 15 | 5 | 1
7 | 7 | 5 | 2
4 | 4 | 6 | 0
9 | 9 | 6 | 1
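Note that the queries above number rows across the whole table, while the question asks for chunks within each group. If the chunking must restart per group, you could partition the window function; a sketch, assuming the grp column name from the demo:
update mytable t
set chunk = x.rn / 100, new_id = x.rn % 100
from (select id,
             row_number() over (partition by grp order by random()) - 1 as rn
      from mytable) x
where x.id = t.id;
Each group then gets its own 0-based numbering, so chunk and new_id restart at 0 within every group.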

Reset sum when condition is met in Oracle

My data is structured as follows:
Timestamp | Hour | Count
--------------------------
20190801 01 | 1 | 10
20190801 02 | 2 | 20
20190801 03 | 3 | 10
20190801 04 | 4 | 5
20190801 05 | 5 | 15
20190801 06 | 6 | 10
20190802 01 | 1 | 5
20190802 02 | 2 | 20
20190802 03 | 3 | 5
20190802 04 | 4 | 15
20190802 05 | 5 | 20
20190802 06 | 6 | 5
20190803 01 | 1 | 30
I'm trying to write an SQL query that calculates a running SUM that resets whenever the hour is 3. The result should look like this:
Hour | Count | SUM
------------------
1 | 10 | 10
2 | 20 | 30
3 | 10 | 10 /* RESET */
4 | 5 | 15
5 | 15 | 30
6 | 10 | 40
1 | 5 | 45
2 | 20 | 65
3 | 5 | 5 /* RESET */
4 | 15 | 20
5 | 20 | 40
6 | 5 | 45
1 | 30 | 75
You could create subgroups using a conditional sum: each time an hour-3 row appears, the inner windowed SUM increments grp, so every reset starts a new partition for the outer running total.
WITH cte AS (
  SELECT t.*,
         SUM(CASE WHEN hour = 3 THEN 1 ELSE 0 END) OVER (ORDER BY timestamp) AS grp
  FROM t
)
SELECT cte.*,
       SUM(Count) OVER (PARTITION BY grp ORDER BY timestamp) AS total
FROM cte
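For the sample data, the grp values the cte assigns look like this, which is why the outer running total restarts exactly at hour 3:
Timestamp   | Hour | Count | grp
20190801 01 | 1    | 10    | 0
20190801 02 | 2    | 20    | 0
20190801 03 | 3    | 10    | 1   /* RESET: new partition */
20190801 04 | 4    | 5     | 1
20190801 05 | 5    | 15    | 1
20190801 06 | 6    | 10    | 1
20190802 01 | 1    | 5     | 1
20190802 02 | 2    | 20    | 1
20190802 03 | 3    | 5     | 2   /* RESET: new partition */
(Depending on the real schema, Count and Timestamp may collide with Oracle keywords and need quoting or renaming.)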

Checking for Consecutive 12 Weeks of 0 Sales

I have a table with customer_number, week, and sales. I need to check if there were 12 consecutive weeks of no sales for each customer and create a flag of 0/1.
I can check the last 12 weeks or a certain time frame, but what's the best way to check for consecutive runs? Here is the code I have so far:
select * from weekly_sales
where customer_nbr in (123, 234)
and week < '2015-11-01'
and week > '2014-11-01'
order by customer_nbr, week
;
Here is a simplified version that only needs week_id and sales:
SELECT S1.weekid AS start_week,
       MAX(S2.weekid) AS end_week,
       SUM(S2.sales) AS total_sales
FROM Sales S1
JOIN Sales S2
  ON S2.weekid BETWEEN S1.weekid AND S1.weekid + 11
WHERE S1.weekid BETWEEN 1 AND 25 -- your search range
GROUP BY S1.weekid
Let me know if that works for you.
OUTPUT
| start_week | end_week | total_sales |
|------------|----------|-------------|
| 1          | 12       | 12          |
| 2          | 13       | 8           |
| 3          | 14       | 3           |
| 4          | 15       | 2           |
| 5          | 16       | 0           | <-
| 6          | 17       | 0           | <- no sales for 12 weeks
| 7          | 18       | 0           | <-
| 8          | 19       | 4           |
| 9          | 20       | 9           |
| 10         | 21       | 11          |
| 11         | 22       | 15          |
| 12         | 23       | 71          |
| 13         | 24       | 78          |
| 14         | 25       | 86          |
| 15         | 25       | 86          | <- rows from here down cover
| 16         | 25       | 86          |    less than a 12-week range
| 17         | 25       | 86          |
| 18         | 25       | 86          |
| 19         | 25       | 86          |
| 20         | 25       | 82          |
| 21         | 25       | 77          |
| 22         | 25       | 75          |
| 23         | 25       | 71          |
| 24         | 25       | 15          |
| 25         | 25       | 8           |
Your final query should have:
HAVING SUM(S2.sales) = 0
   AND COUNT(*) = 12
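Putting that together, a sketch of the final query under the same Sales(weekid, sales) layout; COUNT(*) = 12 filters out the tail of the range, where fewer than 12 weeks remain:
SELECT S1.weekid AS start_week,
       MAX(S2.weekid) AS end_week
FROM Sales S1
JOIN Sales S2
  ON S2.weekid BETWEEN S1.weekid AND S1.weekid + 11
GROUP BY S1.weekid
HAVING SUM(S2.sales) = 0
   AND COUNT(*) = 12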
You could also restrict the range with BETWEEN on the week column, and use COUNT(column) instead of SUM to improve performance; then you only have to check whether the result is greater than 0.

postgres, add row when a value is missing

Forgive what may be a silly question, but I'm not much of a database guru.
I have a table with three columns. Here's a sample:
stationtest | id_date | val_no3
------------+-------------+---------
27 | 1 |
27 | 2 | 7
27 | 25 |
27 | 50 | 8
27 | 75 | 9
27 | 100 | 10
30 | 1 |
30 | 14 | 7
30 | 25 |
30 | 65 | 8
30 | 75 | 9
30 | 100 | 10
I would like a new table that has one row for each missing id_date value, for each stationtest number, like this one:
stationtest | id_date | val_no3
------------+-------------+---------
27 | 1 |
27 | 2 | 7
27 | 3 |
27 | 4 |
27 | 5 |
27 | 6 |
27 | (...) |
27 | 25 |
27 | 26 |
27 | 27 |
27 | (...) |
27 | 50 | 8
27 | (...) |
27 | 75 | 9
27 | (...) |
27 | 98 |
27 | 99 |
27 | 100 | 10
30 | 1 |
30 | 2 | 7
30 | 3 |
30 | 4 |
30 | 5 |
30 | 6 |
30 | (...) |
30 | 25 |
30 | 26 |
30 | 27 |
30 | (...) |
30 | 50 | 8
30 | 75 | 9
30 | (...) |
30 | 98 |
30 | 99 |
30 | 100 | 10
I have this query, but I don't know how to make it work for each stationtest:
insert into tabletest (id_date)
select i
from generate_series(1, (select max(id_date) from tabletest)) i
left join tabletest on tabletest.id_date = i
where tabletest.id_date is null;
Is it possible? Thank you for your help.
Try this:
DO $$
DECLARE
    st_test integer;
BEGIN
    FOR st_test IN (SELECT DISTINCT stationtest FROM tabletest) LOOP
        EXECUTE 'INSERT INTO tabletest (stationtest, id_date)
                 SELECT $1 AS stationtest, generate_series AS id_date
                 FROM generate_series((SELECT min(id_date) FROM tabletest),
                                      (SELECT max(id_date) FROM tabletest))'
        USING st_test;
    END LOOP;
END;
$$;
I don't have the data handy, but the general format should work.
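A set-based alternative avoids the loop entirely; a sketch, assuming id_date should run from 1 to the table-wide maximum and only missing (stationtest, id_date) pairs should be inserted:
INSERT INTO tabletest (stationtest, id_date)
SELECT s.stationtest, g.i
FROM (SELECT DISTINCT stationtest FROM tabletest) s
CROSS JOIN generate_series(1, (SELECT max(id_date) FROM tabletest)) AS g(i)
LEFT JOIN tabletest t
       ON t.stationtest = s.stationtest AND t.id_date = g.i
WHERE t.id_date IS NULL;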

SQL Joining 2 Tables

I would like to merge two tables into one and also add a counter next to that. What I have now is:
SELECT [CUCY_DATA].*, [DIM].[Col1], [DIM].[Col2],
       (SELECT COUNT([Cut Counter])
        FROM [MSD]
        WHERE [CUCY_DATA].[Cut Counter] = [MSD].[Cut Counter]) AS [Nr Of Errors]
FROM [CUCY_DATA]
FULL JOIN [DIM]
  ON [CUCY_DATA].[Cut Counter] = [DIM].[Cut Counter]
This way the data is joined, but where the values don't match, NULLs are inserted. I want, for instance, this:
Table CUCY_DATA
|_Cut Counter_|_Data1_|_Data2_|
| 1 | 12 | 24 |
| 2 | 13 | 26 |
| 3 | 10 | 20 |
| 4 | 11 | 22 |
Table DIM
|_Cut Counter_|_Col1_|_Col2_|
| 1 | 25 | 40 |
| 3 | 50 | 45 |
And they need to be merged into:
|_Cut Counter_|_Data1_|_Data2_|_Col1_|_Col2_|
| 1 | 12 | 24 | 25 | 40 |
| 2 | 13 | 26 | 25 | 40 |
| 3 | 10 | 20 | 50 | 45 |
| 4 | 11 | 22 | 50 | 45 |
SO THIS IS WRONG:
|_Cut Counter_|_Data1_|_Data2_|_Col1__|_Col2__|
| 1 | 12 | 24 | 25 | 40 |
| 2 | 13 | 26 | NULL | NULL |
| 3 | 10 | 20 | 50 | 45 |
| 4 | 11 | 22 | NULL | NULL |
Kind regards, Bob
How are you getting the Col1 and Col2 values where there is no corresponding row in your DIM table (rows 2 and 4)? Your "wrong" result is exactly correct; that's what the outer join does.
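If the intent is to carry the most recent earlier DIM row forward onto the rows in between (which is what the desired table shows), OUTER APPLY could express that; a sketch under that carry-forward reading:
SELECT C.*, D.[Col1], D.[Col2]
FROM [CUCY_DATA] C
OUTER APPLY (SELECT TOP 1 [Col1], [Col2]
             FROM [DIM]
             WHERE [DIM].[Cut Counter] <= C.[Cut Counter]
             ORDER BY [DIM].[Cut Counter] DESC) D
For Cut Counter 2 and 4 this picks up the DIM rows for 1 and 3, matching the desired output.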