SQL Average Data Based on Distance - sql

I'm pretty new to SQL.
I have a database with records based on road/milepoints. My goal is to get an average value every 52.8 ft along the road. My related table has data every 15 ft and, of course, a foreign key relating it to the primary table.
If I wanted to pull out the average value every 52.8 ft, along a given milepost, how would I go about this?
Example Data:
RecID Begin_MP End_MP
100 0 0.56
RecID MP Value1 Value2
100 0 159 127.7
100 0.003 95.3 115.3
100 0.006 82.3 107
100 0.009 56.5 74.5
100 0.011 58.1 89.1
100 0.014 95.2 78.8
100 0.017 108.9 242.5
100 0.02 71.8 73.3
100 0.023 84.1 80.2
100 0.026 65.5 66.1
100 0.028 122 135.8
100 0.031 99.9 230.7
100 0.034 95.7 111.5
100 0.037 127.3 74.3
100 0.04 140.7 543.1
The first table is an example of a road. The second table holds the values I need to average every 52.8 ft.
Thank you

You could group the data into 52.8-foot blocks. One way to do that is to divide the distance by 52.8 and truncate the result to a whole number, which is what casting to int does. That way, 25 belongs to group 0, 100 belongs to group 1, 110 belongs to group 2, and so on.
In SQL Server, you'd write this like:
select
52.8 * cast(dist/52.8 as int) as Distance
, avg(value1)
, avg(value2)
from YourTable
group by cast(dist/52.8 as int)
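To make the bucketing rule concrete outside the database, here is a minimal Python sketch of the same idea: truncate distance / 52.8 to get a block index, then average per block. The function names and the sample rows are made up for illustration and are not from the question.

```python
from collections import defaultdict

BLOCK_FT = 52.8  # block length taken from the question


def block_index(dist_ft, block_ft=BLOCK_FT):
    """Truncate dist/block to a whole number (same effect as CAST(... AS int) for non-negative input)."""
    return int(dist_ft / block_ft)


def average_by_block(rows, block_ft=BLOCK_FT):
    """rows: iterable of (dist_ft, value). Returns {block_index: mean value}."""
    sums = defaultdict(lambda: [0.0, 0])  # block index -> [running total, row count]
    for dist_ft, value in rows:
        acc = sums[block_index(dist_ft, block_ft)]
        acc[0] += value
        acc[1] += 1
    return {idx: total / count for idx, (total, count) in sorted(sums.items())}


# Hypothetical sample rows: (distance in feet, value)
sample = [(0, 10.0), (25, 20.0), (100, 30.0), (110, 50.0), (150, 70.0)]
print(average_by_block(sample))
```

Distances 0 and 25 land in block 0, 100 in block 1, and 110 and 150 in block 2, which is the grouping the GROUP BY expression produces.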
Below is an example with your data. Because the milepoints are in miles and run from 0 to 0.04, I've made it calculate averages per 0.01-mile block, which is the same as 52.8 ft:
-- Table variables holding the sample data from the question
declare @Road table (RecID int, Begin_MP float, End_MP float)
insert into @Road select 100, 0, 0.56
declare @Values table (RecID int, MP float, Value1 float, Value2 float)
insert into @Values values
(100, 0 , 159 , 127.7),
(100, 0.003, 95.3 , 115.3),
(100, 0.006, 82.3 , 107),
(100, 0.009, 56.5 , 74.5),
(100, 0.011, 58.1 , 89.1),
(100, 0.014, 95.2 , 78.8),
(100, 0.017, 108.9, 242.5),
(100, 0.02 , 71.8 , 73.3),
(100, 0.023, 84.1 , 80.2),
(100, 0.026, 65.5 , 66.1),
(100, 0.028, 122 , 135.8),
(100, 0.031, 99.9 , 230.7),
(100, 0.034, 95.7 , 111.5),
(100, 0.037, 127.3, 74.3),
(100, 0.04 , 140.7, 543.1);
-- Average both values per 0.01-mile block of each road
select
r.RecID
, cast(v.MP/0.01 as int)*0.01 as StartMP
, AVG(v.Value1) as AvgVal1
, AVG(v.Value2) as AvgVal2
from @Road as r
left join @Values as v
on r.RecID = v.RecID
group by r.RecID, cast(v.MP/0.01 as int)
This prints:
RecID StartMP AvgVal1 AvgVal2
100 0.00 98,275 106,125
100 0.01 87,4 136,8
100 0.02 85,85 88,85
100 0.03 107,63 138,83
100 0.04 140,7 543,1

Related

Pivoting a data with Nvarchar and Float/integers

I have a set of data (below) and I am trying to get a pivot that sums the quantity and groups by the other columns, with the desired result shown below. Is there an easy way to do this?
Location1 Location2 Product Mode Customer Quantity
61 151 A TL Bill 800
61 151 A TL Bill 800
61 501 B TL Nan 800
61 501 C TL Cas 800
61 901 B TL Cas 800
61 901 B TL Cas 800
61 111 C TL Bill 800
Desired Result:
Location1 Location2 Product Mode Customer Quantity
61 151 A TL Bill 1600
61 501 B TL Nan 800
61 501 C TL Cas 800
61 501 C TL Cas 1600
61 111 C TL Bill 800
Since I'm a little bored, I'll do your work for you.
-- Table variable holding the sample data from the question
Declare @data as Table(Location1 int, Location2 int, Product varchar(1), Mode varchar(2), Customer varchar(5), Quantity float)
Insert into @data(Location1, Location2, Product, Mode, Customer, Quantity)
Values
(61, 151, 'A', 'TL', 'Bill',800 )
, (61, 151, 'A', 'TL', 'Bill',800 )
, (61, 501, 'B', 'TL', 'Nan',800 )
, (61, 501, 'C', 'TL', 'Cas',800 )
, (61, 901, 'B', 'TL', 'Cas',800 )
, (61, 901, 'B', 'TL', 'Cas',800 )
, (61, 111, 'C', 'TL', 'Bill', 800)
-- One output row per distinct combination, with the quantities summed
SELECT
Location1, Location2, Product, Mode, Customer, Sum(Quantity) as TotalQty
FROM
@data
GROUP BY
Location1, Location2, Product, Mode, Customer
ORDER BY
Product
And the Results:
(7 row(s) affected)
Location1 Location2 Product Mode Customer TotalQty
----------- ----------- ------- ---- -------- -----------
61 151 A TL Bill 1600
61 501 B TL Nan 800
61 901 B TL Cas 1600
61 111 C TL Bill 800
61 501 C TL Cas 800
(5 row(s) affected)
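For readers who want to see what the GROUP BY is doing, the same aggregation can be sketched in Python: every distinct combination of the non-quantity columns becomes a key, and quantities with the same key are added up. The rows are copied from the question; the function name `sum_by_key` is just an illustrative choice.

```python
# Rows from the question: (Location1, Location2, Product, Mode, Customer, Quantity)
rows = [
    (61, 151, "A", "TL", "Bill", 800),
    (61, 151, "A", "TL", "Bill", 800),
    (61, 501, "B", "TL", "Nan", 800),
    (61, 501, "C", "TL", "Cas", 800),
    (61, 901, "B", "TL", "Cas", 800),
    (61, 901, "B", "TL", "Cas", 800),
    (61, 111, "C", "TL", "Bill", 800),
]


def sum_by_key(rows):
    """Group on every column except the last one and sum the last one."""
    totals = {}
    for *key, qty in rows:
        key = tuple(key)
        totals[key] = totals.get(key, 0) + qty
    return totals


totals = sum_by_key(rows)
# Order by Product (third column), as the SQL does
for key, qty in sorted(totals.items(), key=lambda kv: kv[0][2]):
    print(*key, qty)
```

This yields five groups, matching the five rows returned by the query above.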

Filtering to non-overlapping Ranges - Amazon RedShift

Background: I have ranges that are often updated to set prices for different amounts of materials. Once certain quotas are met, the prices are dropped. The problem is identifying the current prices after ranges have been updated or added.
I am looking to filter out non continuous ranges from a data set. Here is some test code:
drop table if exists public.test_ranges;
create table public.test_ranges (
category integer
,lower_bound integer
,upper_bound integer
,cost numeric(10,2)
,modifieddate timestamp
);
insert into public.test_ranges values (1,0,70456,0,'2015-09-29');
insert into public.test_ranges values (1,53956,60000,1.28,'2015-02-11');
insert into public.test_ranges values (1,70456,90000,1.02,'2015-09-29');
insert into public.test_ranges values (1,90000,120000,0.88,'2015-02-11');
insert into public.test_ranges values (1,120000,999999999,0.79,'2015-02-11');
insert into public.test_ranges values (2,0,48786,0,'2015-11-02');
insert into public.test_ranges values (2,22500,25000,0.43,'2015-02-17');
insert into public.test_ranges values (2,48786,50000,0.37,'2015-11-02');
insert into public.test_ranges values (2,50000,100000,0.21,'2015-02-17');
insert into public.test_ranges values (2,100000,175000,0.19,'2015-02-17');
insert into public.test_ranges values (2,175000,999999999,0.17,'2015-02-17');
insert into public.test_ranges values (3,0,585969,0,'2015-11-02');
insert into public.test_ranges values (3,346667,375000,0.15,'2014-09-12');
insert into public.test_ranges values (3,375000,500000,0.14,'2014-09-12');
insert into public.test_ranges values (3,500000,600000,0.13,'2014-09-12');
insert into public.test_ranges values (3,585969,999999999,0.02,'2015-11-02');
insert into public.test_ranges values (3,600000,670000,0.12,'2014-09-12');
select * from public.test_ranges order by 1,2;
This code will return:
category lower_bound upper_bound cost modifieddate
--------------------------------------------------
1 0 70456 0 2015-09-29
1 53956 60000 1.28 2015-02-11
1 70456 90000 1.02 2015-09-29
1 90000 120000 0.88 2015-02-11
1 120000 999999999 0.79 2015-02-11
2 0 48786 0 2015-11-02
2 22500 25000 0.43 2015-02-17
2 48786 50000 0.37 2015-11-02
2 50000 100000 0.21 2015-02-17
2 100000 175000 0.19 2015-02-17
2 175000 999999999 0.17 2015-02-17
3 0 585969 0.00 2015-11-02
3 346667 375000 0.15 2014-09-12
3 375000 500000 0.14 2014-09-12
3 500000 600000 0.13 2014-09-12
3 585969 999999999 0.02 2015-11-02
3 600000 670000 0.12 2014-09-12
The desired result:
category lower_bound upper_bound cost modifieddate
--------------------------------------------------
1 0 70456 0 2015-09-29
1 70456 90000 1.02 2015-09-29
1 90000 120000 0.88 2015-02-11
1 120000 999999999 0.79 2015-02-11
2 0 48786 0 2015-11-02
2 48786 50000 0.37 2015-11-02
2 50000 100000 0.21 2015-02-17
2 100000 175000 0.19 2015-02-17
2 175000 999999999 0.17 2015-02-17
3 0 585969 0.00 2015-11-02
3 585969 999999999 0.02 2015-11-02
Thanks in advance for any help.
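You can't do it perfectly without recursive common table expressions, which Redshift did not support at the time of writing.
Partial solution (it won't give you correct results for category 3, because the outdated ranges in that category are adjacent to each other and so pass the neighbor check):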
select tr1.*
from public.test_ranges tr1
left join public.test_ranges tr_left on tr1.category = tr_left.category and tr1.lower_bound = tr_left.upper_bound
left join public.test_ranges tr_right on tr1.category = tr_right.category and tr_right.lower_bound = tr1.upper_bound
where tr1.lower_bound = 0 or tr1.upper_bound = 999999999 or (tr_left.upper_bound is not null and tr_right.lower_bound is not null)
order by tr1.category, tr1.lower_bound;
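The chain-following logic that a recursive CTE would express can be sketched outside the database. The idea, as I read the desired result, is to start at lower_bound 0 in each category and repeatedly jump to the row whose lower_bound equals the current upper_bound. Two things here are my assumptions and not from the question: the function name `current_ranges`, and the tie-break (newest modifieddate wins if two rows share a lower_bound).

```python
from collections import defaultdict

# (category, lower_bound, upper_bound, cost, modifieddate) from the question
rows = [
    (1, 0, 70456, 0.00, "2015-09-29"), (1, 53956, 60000, 1.28, "2015-02-11"),
    (1, 70456, 90000, 1.02, "2015-09-29"), (1, 90000, 120000, 0.88, "2015-02-11"),
    (1, 120000, 999999999, 0.79, "2015-02-11"),
    (2, 0, 48786, 0.00, "2015-11-02"), (2, 22500, 25000, 0.43, "2015-02-17"),
    (2, 48786, 50000, 0.37, "2015-11-02"), (2, 50000, 100000, 0.21, "2015-02-17"),
    (2, 100000, 175000, 0.19, "2015-02-17"), (2, 175000, 999999999, 0.17, "2015-02-17"),
    (3, 0, 585969, 0.00, "2015-11-02"), (3, 346667, 375000, 0.15, "2014-09-12"),
    (3, 375000, 500000, 0.14, "2014-09-12"), (3, 500000, 600000, 0.13, "2014-09-12"),
    (3, 585969, 999999999, 0.02, "2015-11-02"), (3, 600000, 670000, 0.12, "2014-09-12"),
]


def current_ranges(rows, start=0):
    """Walk each category's chain of ranges, beginning at lower_bound == start."""
    by_lower = defaultdict(list)
    for row in rows:
        by_lower[(row[0], row[1])].append(row)
    result = []
    for category in sorted({r[0] for r in rows}):
        bound, seen = start, set()
        while (category, bound) in by_lower and bound not in seen:
            seen.add(bound)  # guards against a zero-length range looping forever
            # Assumption: on a shared lower_bound, the newest modifieddate wins
            row = max(by_lower[(category, bound)], key=lambda r: r[4])
            result.append(row)
            bound = row[2]  # continue from this range's upper_bound
    return result


for row in current_ranges(rows):
    print(row)
```

On the test data this keeps 11 rows, including only the two current ranges for category 3, which is the case the join-based query cannot handle.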

Group rows by condition

I have this data:
Start End Quantity
425 449 24
450 474 24
475 499 24
500 524 24
2300 2324 24
2400 2499 99
2500 2599 99
2800 2899 99
2900 2999 99
3200 3249 49
3250 3299 49
3300 3349 49
3350 3399 49
3400 3449 49
3500 3549 49
3600 3624 24
3650 3674 24
3700 3724 24
3950 3964 14
4000 4000 0
4150 4399 249
4400 4499 99
5034 5075 41
Quantity is a result of End - Start.
I would like to obtain the following data, the Generated rows:
Start End Quantity
425 449 24
450 474 24
475 499 24
500 524 24
425 524 96
2300 2324 24
2300 2324 24
2400 2499 99
2500 2599 99
-----GENERATED----
425 2599 438
------------------
2800 2899 99
2900 2999 99
3200 3249 49
3250 3299 49
3300 3349 49
3350 3399 49
3400 3449 49
3500 3549 49
-----GENERATED-----
2800 3549 492
------------------
3600 3624 24
3650 3674 24
3700 3724 24
3950 3964 14
4000 4000 0
4150 4399 249
4400 4499 99
5034 5075 41
-----GENERATED-----
3600 5075 475
------------------
The condition is that the quantities are summed until the running total reaches 500. Once it passes 500, a new count starts.
I have tried ROLLUP, but I couldn't find the right condition to make it work.
Of course, this is much easier to do in application code than in SQL, but we must do it in the database. Any tool can be used to produce the generated rows: looping functions, new tables, etc.
Error solving
I ran into an error while running @Prdp's query:
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
I found the solution here:
http://sqlhints.com/tag/the-statement-terminated-the-maximum-recursion-100-has-been-exhausted-before-statement-completion/
Update 1
Using @Prdp's query we got the following:
Start End rn st
(400) 424 1 24
425 449 2 48
450 474 3 72
475 499 4 96
500 524 5 120
2300 2324 6 144
2400 2499 7 243
2500 2599 8 342
2800 (2899) 9 (441)
(2900) 2999 10 99
3200 3249 11 148
3250 3299 12 197
3300 3349 13 246
3350 3399 14 295
3400 3449 15 344
3500 3549 16 393
3600 3624 17 417
3650 3674 18 441
3700 3724 19 465
3950 3964 20 479
4000 (4000) 21 (479)
(4150) 4399 22 249
4400 4499 23 348
5034 (5075) 24 (389)
It's getting closer to what we need. Would it be possible to extract only the data in between ( and ) while discarding the other data?
We can use cursors too.
You can use a recursive CTE. I can't think of any better way.
;WITH cte
AS (SELECT *,
Row_number()OVER(ORDER BY start) rn
FROM Yourtable),
rec_cte
AS (SELECT *,
( [End] - Start ) AS st,
1 AS grp
FROM cte
WHERE rn = 1
UNION ALL
SELECT a.*,
CASE
WHEN st + ( a.[End] - a.Start ) >= 500 THEN a.[End] - a.Start
ELSE st + ( a.[End] - a.Start )
END,
CASE
WHEN st + ( a.[End] - a.Start ) >= 500 THEN b.grp + 1
ELSE grp
END
FROM cte a
JOIN rec_cte b
ON a.rn = b.rn + 1)
SELECT Min(Start) as Start,
Max([End]) as [End],
Max(st) as Quantity
FROM rec_cte
GROUP BY grp
OPTION (maxrecursion 0)
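The rule the recursive CTE applies can be hard to read from the SQL alone, so here is a small Python sketch of the same loop: keep a running total of End - Start, and start a new group whenever adding the next row would bring the total to 500 or more. The function name `group_until_limit` is my own; the data is the 23 rows from the question.

```python
def group_until_limit(ranges, limit=500):
    """ranges: list of (start, end) sorted by start.

    Returns one (start, end, quantity) tuple per group, mirroring
    MIN(Start), MAX([End]) and MAX(st) in the CTE's final SELECT.
    """
    groups = []
    for start, end in ranges:
        qty = end - start
        if groups and groups[-1][2] + qty < limit:
            first, last, total = groups[-1]
            groups[-1] = (first, max(last, end), total + qty)  # extend current group
        else:
            groups.append((start, end, qty))  # first row, or limit reached: new group
    return groups


data = [
    (425, 449), (450, 474), (475, 499), (500, 524), (2300, 2324),
    (2400, 2499), (2500, 2599), (2800, 2899), (2900, 2999), (3200, 3249),
    (3250, 3299), (3300, 3349), (3350, 3399), (3400, 3449), (3500, 3549),
    (3600, 3624), (3650, 3674), (3700, 3724), (3950, 3964), (4000, 4000),
    (4150, 4399), (4400, 4499), (5034, 5075),
]

for group in group_until_limit(data):
    print(group)
# (425, 2899, 417)
# (2900, 4000, 479)
# (4150, 5075, 389)
```

Note that these groups follow the CTE's rule applied to the posted data, so they differ from the hand-written GENERATED rows in the question.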
Here is a proposed solution in MySQL. A similar strategy should work in SQL Server.
drop table if exists TestData;
create table TestData(Start int, End int, Quantity int);
insert TestData values (425,449,24);
insert TestData values (450,474,24);
insert TestData values (475,499,24);
insert TestData values (500,524,24);
insert TestData values (2300,2324,24);
insert TestData values (2400,2499,99);
insert TestData values (2500,2599,99);
insert TestData values (2800,2899,99);
insert TestData values (2900,2999,99);
insert TestData values (3200,3249,49);
insert TestData values (3250,3299,49);
insert TestData values (3300,3349,49);
insert TestData values (3350,3399,49);
insert TestData values (3400,3449,49);
insert TestData values (3500,3549,49);
insert TestData values (3600,3624,24);
insert TestData values (3650,3674,24);
insert TestData values (3700,3724,24);
insert TestData values (3950,3964,14);
insert TestData values (4000,4000,0);
insert TestData values (4150,4399,249);
insert TestData values (4400,4499,99);
insert TestData values (5034,5075,41);
drop table if exists DataRange;
create table DataRange (StartRange int, EndRange int);
insert DataRange values (425, 2599);
insert DataRange values (2800,3549);
insert DataRange values (3600,5075);
select
DataRange.StartRange,DataRange.EndRange
,sum(TestData.quantity) as Quantity
from TestData
inner join DataRange on
(TestData.start between DataRange.StartRange and DataRange.EndRange )
or
(TestData.End between DataRange.StartRange and DataRange.EndRange )
group by DataRange.StartRange,DataRange.EndRange

T-SQL Pivot by 2 fields and grouping

I am looking to pivot the rows in the attached image, and I want the output to look something like this:
ID Age Factor
1 30 8.650
1 35 11.52
1 40 13.87
till 100
2 30 7.99
2 35 10.98
2 40 13.43
till 100
3 30 7.32
3 35 10.98
3 40 13.43
till 100
and so on until I reach the last row (81) in the attached data source.
Thank you :)
Source data
Finally got it working -
SELECT a.ID - 1 AS ID, b.Age, CAST(b.Factor AS DECIMAL(19,2)) AS Factor
from t1 a -- data source table.
cross apply (values
(30, F1),
(35, F2),
(40, F3),
(45, F4),
(50, F5),
(55, F6),
(60, F7),
(65, F8),
(70, F9),
(100, F10)
) b(Age, Factor)
where a.ID >= 2
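What the CROSS APPLY (VALUES ...) does is turn each wide row into one output row per (Age, Factor) pair. A rough Python equivalent is below. Since the source image is not available, the two sample rows only carry the three factors quoted in the question, and the column names F1 to F3 are assumed to match the query; the ID - 1 adjustment and the ID >= 2 filter from the SQL are left out.

```python
def unpivot(rows, age_columns):
    """Turn each wide row into one (ID, Age, Factor) tuple per (age, column) pair."""
    out = []
    for row in rows:
        for age, column in age_columns:
            out.append((row["ID"], age, row[column]))
    return out


# Same pairing as the first three entries of the VALUES list in the query
age_columns = [(30, "F1"), (35, "F2"), (40, "F3")]

# Factors taken from the desired output in the question
wide = [
    {"ID": 1, "F1": 8.65, "F2": 11.52, "F3": 13.87},
    {"ID": 2, "F1": 7.99, "F2": 10.98, "F3": 13.43},
]

for record in unpivot(wide, age_columns):
    print(record)
```

Extending `age_columns` with the remaining pairs from the VALUES list would cover the full set of ages.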

How to create a query on an existing table and build a table(view) with aggregated data and a restriction?

What I have is an MS SQL database that stores data coming from equipment mounted in vehicles (1-3 devices per vehicle).
For the moment, there is a table in the database named DeviceStatus, a big table that stores all the information the devices send when they connect to the TCP server. Records are added (SQL INSERT) or updated (SQL UPDATE) here.
The table looks like this:
Sample data:
1040 305 3 8.00 0
1044 305 2 8.00 0
1063 305 1 8.01 1.34
1071 312 2 8.00 0
1075 312 1 8.00 1.33
1078 312 3 8.00 0
1099 414 3 8.00 0
1106 414 2 8.01 0
1113 102 1 8.01 1.34
1126 102 3 8.00 0
Remark: The driver console is always related to the device installed in the first position (it's an extension of the device in position 1; obviously there's only one console per vehicle). So this will be some sort of restriction in order to have the correct info in the desired table (view) presented below :).
What I need is a SQL query (command/statement) to create a table(view) for a so-called "Software Versions Table", where I can see the software version for all devices installed in vehicles (all that did connect and communicate with the server)... something like the table below:
Remark: Device#1 for 414 is missing because it didn't communicate (not yet I guess...)
With the information we have so far, I think you need a query with a PIVOT:
SELECT P.VehicleNo, V.DriverConsoleVersion, P.[1] AS [Device1SwVersion], P.[2] AS [Device2SwVersion], P.[3] AS [Device3SwVersion]
FROM (
-- One row per vehicle, one column per device position
SELECT VehicleNo, [1], [2], [3]
FROM (
SELECT VehicleNo, DevicePosition, DeviceSwVersion
FROM @DeviceInfo
) as d
PIVOT (
MAX(DeviceSwVersion)
FOR DevicePosition IN ([1], [2], [3])
) PIV
) P
-- The console version is only taken from the device in position 1
LEFT JOIN @DeviceInfo V
ON V.VehicleNo = P.VehicleNo AND V.DevicePosition = 1;
You can create a view with such a query.
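The inner query pivots the rows into one row per vehicle with four columns: VehicleNo plus the software version of devices 1 to 3.
That result is then LEFT JOINed back to the same table, restricted to DevicePosition = 1, in order to get the console version associated with Device 1.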
Output:
VehicleNo DriverConsoleVersion Device1SwVersion Device2SwVersion Device3SwVersion
102 1.34 8.01 NULL 8.00
305 1.34 8.01 8.00 8.00
312 1.33 8.00 8.00 8.00
414 NULL NULL 8.01 8.00
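The pivot-then-join logic can also be followed step by step in a short Python sketch: collect the software version per (vehicle, position), and take the console version only from the row in position 1. The rows are the sample data from the question; the function name `software_versions` and the dictionary layout are illustrative choices of mine.

```python
# (DeviceSerial, VehicleNo, DevicePosition, DeviceSwVersion, DriverConsoleVersion)
rows = [
    (1040, 305, 3, "8.00", "0"), (1044, 305, 2, "8.00", "0"),
    (1063, 305, 1, "8.01", "1.34"), (1071, 312, 2, "8.00", "0"),
    (1075, 312, 1, "8.00", "1.33"), (1078, 312, 3, "8.00", "0"),
    (1099, 414, 3, "8.00", "0"), (1106, 414, 2, "8.01", "0"),
    (1113, 102, 1, "8.01", "1.34"), (1126, 102, 3, "8.00", "0"),
]


def software_versions(rows, positions=(1, 2, 3)):
    """Return {vehicle: {"console": ..., 1: ..., 2: ..., 3: ...}}; None means no data."""
    table = {}
    for _serial, vehicle, position, sw, console in rows:
        entry = table.setdefault(
            vehicle, {"console": None, **{p: None for p in positions}}
        )
        # Like MAX(DeviceSwVersion) per (vehicle, position); a plain string comparison
        if position in positions and (entry[position] is None or sw > entry[position]):
            entry[position] = sw
        # The console belongs to the device in position 1 only
        if position == 1:
            entry["console"] = console
    return table


versions = software_versions(rows)
for vehicle in sorted(versions):
    print(vehicle, versions[vehicle])
```

Vehicle 414 ends up with no console version and no Device 1 version, which corresponds to the NULLs in the query output.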
Your data:
Declare @DeviceInfo TABLE([DeviceSerial] int, [VehicleNo] int, [DevicePosition] int, [DeviceSwVersion] varchar(10), [DriverConsoleVersion] varchar(10));
INSERT INTO @DeviceInfo([DeviceSerial], [VehicleNo], [DevicePosition], [DeviceSwVersion], [DriverConsoleVersion])
VALUES
(1040, 305, 3, '8.00', '0'),
(1044, 305, 2, '8.00', '0'),
(1063, 305, 1, '8.01', '1.34'),
(1071, 312, 2, '8.00', '0'),
(1075, 312, 1, '8.00', '1.33'),
(1078, 312, 3, '8.00', '0'),
(1099, 414, 3, '8.00', '0'),
(1106, 414, 2, '8.01', '0'),
(1113, 102, 1, '8.01', '1.34'),
(1126, 102, 3, '8.00', '0')
;
I like the PIVOT answer, but here is another way:
select VehicleNo,
max(DriverConsoleVersion) DriverConsoleVersion,
max(case when DevicePosition = 1 then DeviceSwVersion end) Device1SwVersion,
max(case when DevicePosition = 2 then DeviceSwVersion end) Device2SwVersion,
max(case when DevicePosition = 3 then DeviceSwVersion end) Device3SwVersion
from @DeviceInfo
group by VehicleNo
order by VehicleNo
You can also cast or format the values. For example:
select ...,
isnull(cast(cast(
max(case when DevicePosition = 1 then DeviceSwVersion end)
as decimal(8,2)) / 100 as varchar(5)), '') Device1SwVersion,