Create Set of IDs based on Varying ID Ranges


I have a collection of tables that record missing document ID ranges. Start Range is the first ID, End Range is the last ID, and Missing is the number of IDs missing within that range (the count includes both the start and end IDs). How can I expand each range into its individual IDs in a new table, one row per ID, instead of storing the range itself?
This is how the data is presented:
+-------------+-----------+---------+-----------+
| Start Range | End Range | Missing | Date |
+-------------+-----------+---------+-----------+
| 184 | 186 | 3 | 1/9/1979 |
| 204 | 207 | 4 | 1/9/1979 |
| 209 | 212 | 4 | 1/9/1979 |
| 223 | 224 | 2 | 1/9/1979 |
| 240 | 241 | 2 | 1/10/1979 |
| 243 | 243 | 1 | 1/10/1979 |
| 248 | 249 | 2 | 1/10/1979 |
| 261 | 265 | 5 | 1/11/1979 |
+-------------+-----------+---------+-----------+
I am looking to achieve output like this:
+-----+-----------+
| ID | Date |
+-----+-----------+
| 184 | 1/9/1979 |
| 185 | 1/9/1979 |
| 186 | 1/9/1979 |
| 204 | 1/9/1979 |
| 205 | 1/9/1979 |
| 206 | 1/9/1979 |
| 207 | 1/9/1979 |
| 209 | 1/9/1979 |
| 210 | 1/9/1979 |
| 211 | 1/9/1979 |
| 212 | 1/9/1979 |
| 223 | 1/9/1979 |
| 224 | 1/9/1979 |
| 240 | 1/10/1979 |
| 241 | 1/10/1979 |
| 243 | 1/10/1979 |
| 248 | 1/10/1979 |
| 249 | 1/10/1979 |
| 261 | 1/11/1979 |
| 262 | 1/11/1979 |
| 263 | 1/11/1979 |
| 264 | 1/11/1979 |
| 265 | 1/11/1979 |
+-----+-----------+
What would be the best method to achieve this? Thanks for the assistance.

I use a UDF to generate ranges, but a numbers table or tally table would do the trick as well:
-- Sample data in a table variable
Declare @Table table (StartRange int,EndRange int,Date Date)
Insert into @Table values
(184,186,'1979-01-09'),
(204,207,'1979-01-09')
-- Join each range to every generated number that falls inside it
Select B.ID
,A.Date
From @Table A
Join (Select ID=cast(RetVal as int) from [dbo].[udf-Create-Range-Number](1,9999,1)) B
on B.ID between A.StartRange and A.EndRange
Order by B.ID,Date
Returns
ID Date
184 1979-01-09
185 1979-01-09
186 1979-01-09
204 1979-01-09
205 1979-01-09
206 1979-01-09
207 1979-01-09
The UDF
CREATE FUNCTION [dbo].[udf-Create-Range-Number] (@R1 money,@R2 money,@Incr money)
-- Syntax Select * from [dbo].[udf-Create-Range-Number](0,100,2)
Returns
@ReturnVal Table (RetVal money)
As
Begin
-- Recursive CTE: start at @R1 and keep adding @Incr until @R2 is reached
With NumbTable as (
Select NumbFrom = @R1
union all
Select nf.NumbFrom + @Incr
From NumbTable nf
Where nf.NumbFrom < @R2
)
Insert into @ReturnVal(RetVal)
Select NumbFrom from NumbTable Option (maxrecursion 32767)
Return
End

Build a tally table, or generate one with a CTE.
A tally table is just a single column of incrementing integers n. Join it to the ranges:
Select tally.n as Id, missingRange.date
from tally
inner join missingRange
On tally.n >= beginRange
And tally.n <= endRange
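A minimal sketch of the tally join above, run through Python's built-in sqlite3 module so it can be tried without a database server. The table name missing_range, its column names, and the recursive CTE that generates the tally are assumptions made for illustration; on SQL Server a persisted numbers table would play the same role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE missing_range (start_range INTEGER, end_range INTEGER, dt TEXT)"
)
conn.executemany(
    "INSERT INTO missing_range VALUES (?, ?, ?)",
    [(184, 186, "1979-01-09"), (243, 243, "1979-01-10"), (261, 265, "1979-01-11")],
)

# bounds: smallest start and largest end across all ranges.
# tally: one row per integer between those bounds (hypothetical helper CTE).
expand_sql = """
WITH RECURSIVE
bounds(lo, hi) AS (
    SELECT MIN(start_range), MAX(end_range) FROM missing_range
),
tally(n) AS (
    SELECT lo FROM bounds
    UNION ALL
    SELECT n + 1 FROM tally, bounds WHERE n < hi
)
SELECT tally.n AS id, r.dt
FROM tally
JOIN missing_range AS r
  ON tally.n BETWEEN r.start_range AND r.end_range
ORDER BY id
"""

expanded = conn.execute(expand_sql).fetchall()
for row in expanded:
    print(row)
```

Each range contributes one output row per ID it covers, so a single-ID range such as 243 to 243 yields exactly one row.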

Related

Average by day using timestamp

I have the following mariadb table. The data is added 3 times per day. I am looking to write a SQL query that would give me the average amount for the day. This way I can say on May 13 'serender' averaged x amt, 'shilta' averaged x amt and 'snowq' averaged x amt. On May 14th the averages were... and so on for each date.
| key | timestamp | card | amt |
-------------------------------------------
| 126 | 1620837006 | serender | 8040 |
| 127 | 1620837006 | shilta | 752 |
| 128 | 1620837006 | snowq | 308 |
| 132 | 1620862207 | serender | 846 |
| 133 | 1620862207 | shilta | 803 |
| 134 | 1620862207 | snowq | 759 |
| 139 | 1620894616 | serender | 845 |
| 140 | 1620894616 | shilta | 805 |
| 141 | 1620894616 | snowq | 759 |
| 146 | 1620923404 | serender | 869 |
| 147 | 1620923404 | shilta | 804 |
| 148 | 1620923404 | snowq | 759 |
| 153 | 1620948607 | serender | 755 |
| 154 | 1620948607 | shilta | 650 |
| 155 | 1620948607 | snowq | 530 |

If you want to see the date then convert it from a Unix timestamp to a date:
select date(from_unixtime(timestamp)) as dte, card, avg(amt)
from t
group by dte, card;
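To see the grouping in action, here is a small sketch using Python's built-in sqlite3 module with a few rows from the question. It is not MariaDB: SQLite's date(ts, 'unixepoch') stands in for date(from_unixtime(...)) and always works in UTC, whereas from_unixtime uses the session time zone, so the day boundaries may differ on a real server. The column name ts is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ts INTEGER, card TEXT, amt INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1620837006, "serender", 8040), (1620837006, "shilta", 752),
        (1620862207, "serender", 846),  (1620862207, "shilta", 803),
        (1620894616, "serender", 845),  (1620894616, "shilta", 805),
        (1620923404, "serender", 869),  (1620923404, "shilta", 804),
        (1620948607, "serender", 755),  (1620948607, "shilta", 650),
    ],
)

# Convert the Unix timestamp to a calendar date (UTC), then average per day and card.
daily_avg = conn.execute(
    """
    SELECT date(ts, 'unixepoch') AS dte, card, AVG(amt) AS avg_amt
    FROM t
    GROUP BY dte, card
    ORDER BY dte, card
    """
).fetchall()

for row in daily_avg:
    print(row)
```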

Getting two columns one containing and one not containing a grouped value

My data looks like this -
+-----------+-----------+-----------+----------+
| FLIGHT_NO | FL_DATE | SERIAL_NO | PILOT_NO |
+-----------+-----------+-----------+----------+
| 501 | 15-OCT-19 | 456710 | 345 |
| 521 | 16-OCT-19 | 562911 | 345 |
| 534 | 17-OCT-19 | 877694 | 345 |
| 577 | 17-OCT-19 | 338157 | 345 |
| 501 | 14-OCT-19 | 921225 | 346 |
| 534 | 15-OCT-19 | 877694 | 346 |
| 534 | 14-OCT-19 | 338157 | 347 |
| 590 | 16-OCT-19 | 650012 | 347 |
| 531 | 14-OCT-19 | 562911 | 348 |
| 531 | 15-OCT-19 | 562911 | 348 |
| 501 | 16-OCT-19 | 220989 | 349 |
| 521 | 18-OCT-19 | 650012 | 349 |
| 590 | 14-OCT-19 | 562911 | 351 |
| 577 | 18-OCT-19 | 877694 | 351 |
| 590 | 18-OCT-19 | 456710 | 346 |
+-----------+-----------+-----------+----------+
My aim is to return the total number of flights flying and not flying on 18-oct-19.
I'm doing it with dual but that doesn't seem to be the correct/best method.
Can anyone help me do it the correct way?
SELECT
(SELECT COUNT(FLIGHT_NO) NO_FLY FROM schd_flight WHERE fl_date = '18-OCT-19') AS FLY,
(SELECT COUNT(FLIGHT_NO) NO_FLY FROM schd_flight WHERE fl_date <> '18-OCT-19') AS NO_FLY
FROM dual;
My output -
+-----+--------+
| fly | no_fly |
+-----+--------+
| 3 | 12 |
+-----+--------+

Simply use SUM with a CASE expression. The else 0 makes each sum return 0 rather than NULL when no row matches:
Select
sum(case when fl_date = '18-OCT-19' then 1 else 0 end) fly,
sum(case when fl_date <> '18-OCT-19' then 1 else 0 end) no_fly
From schd_flight;
Cheers!!
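For anyone who wants to check the numbers, here is a rough sketch of the same conditional aggregation run against the question's rows with Python's built-in sqlite3 module. Storing the dates as ISO text is an assumption made for the example; on Oracle you would compare against a proper DATE value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schd_flight (flight_no INTEGER, fl_date TEXT)")
conn.executemany(
    "INSERT INTO schd_flight VALUES (?, ?)",
    [
        (501, "2019-10-15"), (521, "2019-10-16"), (534, "2019-10-17"),
        (577, "2019-10-17"), (501, "2019-10-14"), (534, "2019-10-15"),
        (534, "2019-10-14"), (590, "2019-10-16"), (531, "2019-10-14"),
        (531, "2019-10-15"), (501, "2019-10-16"), (521, "2019-10-18"),
        (590, "2019-10-14"), (577, "2019-10-18"), (590, "2019-10-18"),
    ],
)

# One pass over the table: each CASE contributes 1 or 0 to its own counter.
fly, no_fly = conn.execute(
    """
    SELECT SUM(CASE WHEN fl_date =  :d THEN 1 ELSE 0 END) AS fly,
           SUM(CASE WHEN fl_date <> :d THEN 1 ELSE 0 END) AS no_fly
    FROM schd_flight
    """,
    {"d": "2019-10-18"},
).fetchone()

print(fly, no_fly)
```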

I think the second subquery is not necessary, since no_fly = total - fly.
So I came up with this solution, which may improve the query time:
SELECT sub.FLY as FLY, (SELECT count(*) from schd_flight) - sub.FLY as NO_FLY
FROM (
SELECT COUNT(CASE when fl_date = '18-OCT-19' then 1 end) AS FLY
from schd_flight
) sub;
Not tested yet though.

SQL: How do I get parent and child records for data maintained in same table

This is what my table (locationgroup) looks like. Parent and child relationships are maintained in the same table, and there can be many more records (ids might not be in sequence).
+------+------------+------------+--------------+---------------+-----------+
| id | name | parentid | customerid | type | deleted |
|------+------------+------------+--------------+---------------+-----------|
| 131 | Zone | 0 | 79 | zone | False |
| 132 | State | 131 | 79 | state | False |
| 136 | Center 3 | 133 | 79 | servicecentre | False |
| 134 | Center 1 | 133 | 79 | servicecentre | False |
| 135 | Center 2 | 133 | 79 | servicecentre | False |
| 133 | City | 132 | 79 | city | False |
| 137 | Center 4 | 133 | 79 | servicecentre | False |
+------+------------+------------+--------------+---------------+-----------+
What I want to achieve is to get parent(s) and child(s) of any given id.
Eg: for id - 131 the result should be
+------+------------+------------+--------------+---------------+-----------+
| id | name | parentid | customerid | type | deleted |
|------+------------+------------+--------------+---------------+-----------|
| 131 | Zone | 0 | 79 | zone | False |
| 132 | State | 131 | 79 | state | False |
| 133 | City | 132 | 79 | city | False |
| 134 | Center 1 | 133 | 79 | servicecentre | False |
| 135 | Center 2 | 133 | 79 | servicecentre | False |
| 136 | Center 3 | 133 | 79 | servicecentre | False |
| 137 | Center 4 | 133 | 79 | servicecentre | False |
+------+------------+------------+--------------+---------------+-----------+
Similarly, for id - 137 the result should be
+------+------------+------------+--------------+---------------+-----------+
| id | name | parentid | customerid | type | deleted |
|------+------------+------------+--------------+---------------+-----------|
| 131 | Zone | 0 | 79 | zone | False |
| 132 | State | 131 | 79 | state | False |
| 133 | City | 132 | 79 | city | False |
| 137 | Center 4 | 133 | 79 | servicecentre | False |
+------+------------+------------+--------------+---------------+-----------+
I am able to get only child records with my query
WITH RECURSIVE locgrp AS (
SELECT
lg.*
FROM locationgroup lg
WHERE lg.customerid = 79 AND lg.id IN (133) AND lg.deleted = FALSE
UNION
SELECT
lg_union_1.*
FROM locationgroup lg_union_1
INNER JOIN locgrp lg_union_2 ON lg_union_1.parentid = lg_union_2.id
WHERE lg_union_1.deleted = FALSE AND lg_union_2.deleted = FALSE
)
SELECT *
FROM locgrp ORDER BY id ASC;
Eg: for id - 137 what I get is
+------+------------+------------+--------------+---------------+-----------+
| id | name | parentid | customerid | type | deleted |
|------+------------+------------+--------------+---------------+-----------|
| 137 | Center 4 | 133 | 79 | servicecentre | False |
+------+------------+------------+--------------+---------------+-----------+
I can achieve the desired result by changing line
INNER JOIN locgrp lg_union_2 ON lg_union_1.parentid = lg_union_2.id
in query to
INNER JOIN locgrp lg_union_2 ON lg_union_1.id = lg_union_2.parentid
but then they're two different queries for the same purpose.
How can I modify my query to get parent and child records with the same query? There is no restriction that I should stick to a recursive query or anything.

Since you're good with a function, something like this should work. I don't pretend it's the coolest looking thing, but I do believe it yields the correct results:
CREATE OR REPLACE FUNCTION recurse_me(cust_id integer, loc_id integer)
RETURNS SETOF locationgroup AS
$BODY$
declare
rw locationgroup%rowtype;
begin
FOR rw IN
WITH RECURSIVE locgrp AS (
SELECT
lg.*
FROM locationgroup lg
WHERE lg.customerid = cust_id AND lg.id = loc_id AND lg.deleted = FALSE
UNION
SELECT
lg_union_1.*
FROM locationgroup lg_union_1
INNER JOIN locgrp lg_union_2 ON lg_union_1.parentid = lg_union_2.id
WHERE lg_union_1.deleted = FALSE AND lg_union_2.deleted = FALSE
)
SELECT * FROM locgrp
LOOP
return next rw;
END LOOP;
FOR rw IN
WITH RECURSIVE locgrp AS (
SELECT
lg.*
FROM locationgroup lg
WHERE lg.customerid = cust_id AND lg.id = loc_id AND lg.deleted = FALSE
UNION
SELECT
lg_union_1.*
FROM locationgroup lg_union_1
INNER JOIN locgrp lg_union_2 ON lg_union_1.id = lg_union_2.parentid
WHERE lg_union_1.deleted = FALSE AND lg_union_2.deleted = FALSE
)
SELECT * FROM locgrp where id != loc_id
LOOP
return next rw;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
Implementation:
postgres=# select * from recurse_me (79, 133) order by id;
id | name | parentid | customerid | type | deleted
-----+--------------+----------+------------+-----------------+---------
131 | Zone | 0 | 79 | zone | f
132 | State | 131 | 79 | state | f
133 | City | 132 | 79 | city | f
134 | Center 1 | 133 | 79 | servicecentre | f
135 | Center 2 | 133 | 79 | servicecentre | f
136 | Center 3 | 133 | 79 | servicecentre | f
137 | Center 4 | 133 | 79 | servicecentre | f
(7 rows)
postgres=# select * from recurse_me (79, 137) order by id;
id | name | parentid | customerid | type | deleted
-----+--------------+----------+------------+-----------------+---------
131 | Zone | 0 | 79 | zone | f
132 | State | 131 | 79 | state | f
133 | City | 132 | 79 | city | f
137 | Center 4 | 133 | 79 | servicecentre | f
(4 rows)
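If a function is not wanted, one possible alternative is a single statement with two recursive CTEs, one walking down to the children and one walking up to the parents, combined with UNION. The sketch below runs that idea through Python's built-in sqlite3 module; the CTE names descendants and ancestors and the helper family_ids are made up for illustration, and deleted is stored as 0/1 because SQLite has no boolean type. The same WITH RECURSIVE shape should carry over to PostgreSQL, though it has not been tested there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locationgroup (id INTEGER PRIMARY KEY, name TEXT, "
    "parentid INTEGER, customerid INTEGER, type TEXT, deleted INTEGER)"
)
conn.executemany(
    "INSERT INTO locationgroup VALUES (?, ?, ?, ?, ?, 0)",
    [
        (131, "Zone", 0, 79, "zone"),
        (132, "State", 131, 79, "state"),
        (136, "Center 3", 133, 79, "servicecentre"),
        (134, "Center 1", 133, 79, "servicecentre"),
        (135, "Center 2", 133, 79, "servicecentre"),
        (133, "City", 132, 79, "city"),
        (137, "Center 4", 133, 79, "servicecentre"),
    ],
)

FAMILY_SQL = """
WITH RECURSIVE
descendants AS (          -- start row, then follow parentid downwards
    SELECT lg.* FROM locationgroup lg
    WHERE lg.customerid = :cust AND lg.id = :loc AND lg.deleted = 0
    UNION
    SELECT child.* FROM locationgroup child
    JOIN descendants d ON child.parentid = d.id
    WHERE child.deleted = 0
),
ancestors AS (            -- start row, then follow parentid upwards
    SELECT lg.* FROM locationgroup lg
    WHERE lg.customerid = :cust AND lg.id = :loc AND lg.deleted = 0
    UNION
    SELECT parent.* FROM locationgroup parent
    JOIN ancestors a ON parent.id = a.parentid
    WHERE parent.deleted = 0
)
SELECT * FROM descendants
UNION                     -- UNION removes the duplicated start row
SELECT * FROM ancestors
ORDER BY id
"""

def family_ids(loc_id, cust_id=79):
    """Return the ids of loc_id, its ancestors and its descendants."""
    rows = conn.execute(FAMILY_SQL, {"cust": cust_id, "loc": loc_id})
    return [row[0] for row in rows]

print(family_ids(137))
print(family_ids(133))
```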

equating an entry to an aggregated version of itself

I am trying to find if an entry's value is the max of the grouped value. Its purpose is to sit in a larger if logic.
Which I'd expect would look something like this:
SELECT
t.id as t_id,
sum(if(t.value = max(t.value), 1, 0)) AS is_max_value
FROM dataset.table AS t
GROUP BY t_id
The response is:
Error: Expression 't.value' is not present in the GROUP BY list
How should my code look to do this?

You first need to compute the max value per group in a subquery, then join that result back to the table.
Here is an example using the public Shakespeare sample data set:
SELECT
t.word,
t.word_count,
t.corpus_date
FROM
[publicdata:samples.shakespeare] t
JOIN (
SELECT
corpus_date,
MAX(word_count) word_count,
FROM
[publicdata:samples.shakespeare]
GROUP BY
1 ) d
ON
d.corpus_date=t.corpus_date
AND t.word_count=d.word_count
LIMIT
25
Results:
+-----+--------+--------------+---------------+
| Row | t_word | t_word_count | t_corpus_date |
+-----+--------+--------------+---------------+
| 1 | the | 762 | 1597 |
| 2 | the | 894 | 1598 |
| 3 | the | 841 | 1590 |
| 4 | the | 680 | 1606 |
| 5 | the | 942 | 1607 |
| 6 | the | 779 | 1609 |
| 7 | the | 995 | 1600 |
| 8 | the | 937 | 1599 |
| 9 | the | 738 | 1612 |
| 10 | the | 612 | 1595 |
| 11 | the | 848 | 1592 |
| 12 | the | 753 | 1594 |
| 13 | the | 740 | 1596 |
| 14 | I | 828 | 1603 |
| 15 | the | 525 | 1608 |
| 16 | the | 363 | 0 |
| 17 | I | 629 | 1593 |
| 18 | I | 447 | 1611 |
| 19 | the | 715 | 1602 |
| 20 | the | 717 | 1610 |
+-----+--------+--------------+---------------+
You can see that it retains the words that have the maximum word_count in each partition defined by corpus_date.

Use a window function to "spread" the max value over all relevant records.
This way you can avoid the join.
SELECT
*
FROM (
SELECT
corpus,
corpus_date,
word,
word_count,
MAX(word_count) OVER (PARTITION BY corpus) AS Max_Word_Count
FROM
[publicdata:samples.shakespeare] )
WHERE
word_count=Max_Word_Count

select
id,
value,
integer(value = max_value) as is_max_value
from (
select id, value, max(value) over(partition by id) as max_value
from dataset.table
)
Explanation:
Inner select - for each row/record calculates max of value among all rows with the same id
Outer select - for each row/record compares row's value with max value for respective group and then converts true or false into respectively 1 or 0 (as per expectation in question)
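A quick way to try the inner/outer select pattern locally is Python's built-in sqlite3 module, as sketched below. This assumes a bundled SQLite of 3.25 or newer (needed for window functions); the table name readings and its rows are invented for the example, and a plain comparison stands in for BigQuery's integer(...) because SQLite already returns 1 or 0 for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 20), (2, 5), (2, 5)],
)

# Inner select: attach the per-id maximum to every row.
# Outer select: compare each row's value with that maximum (1 = is max, 0 = not).
flagged = conn.execute(
    """
    SELECT id, value, value = max_value AS is_max_value
    FROM (
        SELECT id, value, MAX(value) OVER (PARTITION BY id) AS max_value
        FROM readings
    ) AS with_max
    ORDER BY id, value
    """
).fetchall()

for row in flagged:
    print(row)
```

Ties are all flagged, so both rows of id 2 come back with is_max_value = 1.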

Subquery for max ID numbers

I have a query that I am trying to filter for a report. Each addressID can have multiple jobs and each job can have multiple elements to it.
Basically I am trying to get the maximum jobID for each addressID, but I want to get each element of the job.
The current Query results are:
+-----------+-------+--------+
| AddressID | JobID | Cost |
+-----------+-------+--------+
| 326 | 328 | £52.50 |
| 327 | 329 | £55.13 |
| 328 | 330 | £57.88 |
| 329 | 331 | £60.78 |
| 329 | 331 | £63.81 |
| 330 | 332 | £67.00 |
| 330 | 332 | £70.36 |
| 330 | 332 | £73.87 |
| 330 | 332 | £77.57 |
| 330 | 333 | £57.75 |
| 330 | 333 | £60.64 |
| 330 | 333 | £63.67 |
| 330 | 333 | £66.85 |
| 331 | 334 | £70.20 |
| 331 | 334 | £73.71 |
| 331 | 335 | £77.39 |
| 331 | 336 | £81.26 |
| 331 | 336 | £85.32 |
| 331 | 336 | £89.59 |
+-----------+-------+--------+
And I am trying to get:
+-----------+-------+--------+
| AddressID | JobID | Cost |
+-----------+-------+--------+
| 326 | 328 | £52.50 |
| 327 | 329 | £55.13 |
| 328 | 330 | £57.88 |
| 329 | 331 | £60.78 |
| 329 | 331 | £63.81 |
| 330 | 333 | £57.75 |
| 330 | 333 | £60.64 |
| 330 | 333 | £63.67 |
| 330 | 333 | £66.85 |
| 331 | 336 | £81.26 |
| 331 | 336 | £85.32 |
| 331 | 336 | £89.59 |
+-----------+-------+--------+
I had been looking at SELECT TOP 1 to isolate the MAX JobID, but ended up limiting the query to just 1 entry.
Currently tweaking this subquery, but still not sure I'm on the right track:
(SELECT Max(vusearch.JobID) FROM vuSearch AS T WHERE PAID = vuSearch.AddressID GROUP BY AddressID)
Can anyone advise?

Here is one method:
select v.*
from vusearch as v
where v.JobId = (select max(v2.JobId)
from vusearch as v2
where v2.AddressId = v.AddressId
);
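As a rough check of the correlated subquery above, the following sketch loads a subset of the question's rows into an in-memory SQLite database via Python's sqlite3 module. Storing Cost as a plain number without the currency symbol is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vusearch (AddressID INTEGER, JobID INTEGER, Cost REAL)")
conn.executemany(
    "INSERT INTO vusearch VALUES (?, ?, ?)",
    [
        (329, 331, 60.78), (329, 331, 63.81),
        (330, 332, 67.00), (330, 333, 57.75), (330, 333, 60.64),
        (331, 334, 70.20), (331, 335, 77.39),
        (331, 336, 81.26), (331, 336, 85.32),
    ],
)

# Keep every element row whose JobID is the largest JobID for its AddressID.
latest_jobs = conn.execute(
    """
    SELECT v.AddressID, v.JobID, v.Cost
    FROM vusearch AS v
    WHERE v.JobID = (SELECT MAX(v2.JobID)
                     FROM vusearch AS v2
                     WHERE v2.AddressID = v.AddressID)
    ORDER BY v.AddressID, v.Cost
    """
).fetchall()

for row in latest_jobs:
    print(row)
```

Because the filter is on JobID rather than on a single row, every element of the latest job survives, which is what SELECT TOP 1 could not do.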

Managed to get it fixed - I probably hadn't provided enough information as I was trying to keep my explanation simple.
Many thanks for your help Gordon
((vuSearch.PDID) IN ( (SELECT Max(v2.PDID) FROM vuSearch AS v2 GROUP BY v2.PAID)))
