You are given a database of notebooks that contains two tables.
The table notebooks_brand contains the names of notebook brands.
The table notebooks_notebook contains each notebook's name, diagonal, width, depth, and height, and has a foreign key to the brand the model belongs to.
You need to select groups of notebooks by size. To do this, each size should first be rounded down to the nearest multiple of 5 (so it ends in 0 or 5), then the rows grouped by the rounded sizes, counting the number of laptops in each group. Sort the data by size.
I wrote a query that counts how many laptops each brand has:
cursor.execute("""SELECT brnd.title,
COUNT(brnd.id)
FROM notebooks_notebook AS ntbk
JOIN notebooks_brand AS brnd
ON ntbk.brand_id = brnd.id
GROUP BY brnd.title """)
('HP', 225)
('Prestigio', 1)
('Huawei', 6)
('ASUS', 223)
('Haier', 2)
('Xiaomi', 13)
('MSI', 34)
('HONOR', 15)
('Gigabyte', 5)
('Digma', 4)
('Lenovo', 253)
('Dell', 75)
('Acer', 82)
('Chuwi', 4)
('Apple', 55)
Postgres does integer division. Assuming that your size columns are defined as integers, we can round down to the nearest multiple of 5 with an expression like:
width / 5 * 5
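To see the truncation at work outside the database, here is the same arithmetic in plain Python (`//` is integer division, matching Postgres's integer `/`; the sample widths are made up):

```python
# Integer division drops the remainder, so n // 5 * 5 snaps a
# non-negative integer down to the nearest multiple of 5.
for width in (311, 314, 315, 319, 320):
    print(width, width // 5 * 5)
```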
We can apply this logic to your query; starting from your existing joins, we can compute the rounded values in a lateral join, then aggregate:
select x.width, x.depth, x.height, count(*) cnt
from notebooks_notebook n
inner join notebooks_brand as b on n.brand_id = b.id
cross join lateral (values (width / 5 * 5, depth / 5 * 5, height / 5 * 5)) x(width, depth, height)
group by x.width, x.depth, x.height
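As a quick way to sanity-check the grouping logic, here is a minimal sketch using Python's sqlite3 with made-up sizes; SQLite has no LATERAL, so the rounding expressions go straight into the SELECT and GROUP BY:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE notebooks_notebook (title TEXT, width INT, depth INT, height INT);
-- hypothetical sample sizes in millimetres
INSERT INTO notebooks_notebook VALUES
 ('a', 312, 221, 18), ('b', 314, 223, 19), ('c', 358, 248, 22);
""")
rows = con.execute("""
SELECT width / 5 * 5 AS w, depth / 5 * 5 AS d, height / 5 * 5 AS h,
       COUNT(*) AS cnt
FROM notebooks_notebook
GROUP BY w, d, h
ORDER BY w, d, h
""").fetchall()
print(rows)  # -> [(310, 220, 15, 2), (355, 245, 20, 1)]
```

Notebooks 'a' and 'b' round to the same (310, 220, 15) bucket, so they are counted together.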
Related
I'm essentially looking for a way to normalise (t/t0) a bunch of measurements across timepoints against a specific timepoint, e.g. timepoint=0.
I have a table as following:
coordinate,timepoint,quantity
A1,0,50
B2,0,10
C3,0,60
A2,0,20
F1,0,20
A1,1,100
B2,1,150
C3,1,120
A2,1,140
F1,1,160
A1,4,100
B2,4,80
C3,4,80
A2,4,100
F1,4,120
I want to make a table that divides all the other non-zero timepoint rows by the timepoint-0 rows where the coordinates match, i.e. A1-t1 / A1-t0, A1-t4 / A1-t0, B2-t1 / B2-t0, B2-t4 / B2-t0, etc., for wherever there is a join on coordinate available.
The result would be like:
coordinate,timepoint,quantity
A1,0,1
B2,0,1
C3,0,1
A2,0,1
F1,0,1
A1,1,2
B2,1,15
C3,1,2
A2,1,7
F1,1,8
etc.
Something like this mostly works...
select t0.coordinate, t0.quantity, tother.quantity, tother.quantity / t0.quantity as tnorm
from (select * from table
      where timepoint != 0) as tother
left join (select * from table
           where timepoint = 0) as t0
on (tother.coordinate = t0.coordinate);
Ideally, though, I would like a pivot of the table, where each column is one normalisation, e.g. columns as
coordinate, t0/t0, t1/t0, t4/t0 etc.
A1,1,2,value etc.
B2,1,15,value etc.
C3,1,2,value etc.
A2,1,7,value etc.
F1,1,8,value etc.
...though this might not be possible and must be done in postprocessing (e.g. pandas pivot).
I couldn't work out the right syntax for this one - any help is appreciated.
WITH t1 AS (
    SELECT coordinate, quantity
    FROM table
    WHERE timepoint = 0
)
SELECT t2.coordinate, t2.timepoint, (t2.quantity / t1.quantity) AS quantity
FROM table t2
INNER JOIN t1 ON t2.coordinate = t1.coordinate
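Here is a runnable sketch of this CTE approach, using Python's sqlite3 and a subset of the sample rows (the table is named `measurements` since `table` is a reserved word; integer division reproduces the whole-number ratios in the expected output, so multiply by 1.0 first if you need fractions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE measurements (coordinate TEXT, timepoint INT, quantity INT);
INSERT INTO measurements VALUES
 ('A1',0,50),('B2',0,10),('A1',1,100),('B2',1,150),('A1',4,100),('B2',4,80);
""")
rows = con.execute("""
WITH t0 AS (
  SELECT coordinate, quantity FROM measurements WHERE timepoint = 0
)
-- divide every row by the timepoint-0 row with the same coordinate
SELECT m.coordinate, m.timepoint, m.quantity / t0.quantity AS quantity
FROM measurements m
JOIN t0 ON m.coordinate = t0.coordinate
ORDER BY m.timepoint, m.coordinate
""").fetchall()
for r in rows:
    print(r)
```

The timepoint-0 rows normalise to 1, and B2 at timepoint 1 comes out as 150 / 10 = 15, matching the expected table.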
I use generate_series to create a series of numbers from 0 to 200.
I have a table that contains dirt areas in mm² in a column called polutionmm2. What I need is to left join this table to the generated series, but the dirt area must be in cm², so divided by 100. I was not able to make this work, as I can't figure out how to join a table to a series that has no name.
This is what I have so far:
select generate_series(0,200,1) as x, cast(p.polutionmm2/100 as char(8)) as metric
from x
left join polutiondistributionstatistic as p on metric = x
error: relation X does not exist
Here is some sample data: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=3d7d851887adb938819d6cf3e5849719
what I would need, is the first column (x) counting all the way from 0 to 200, and where there is a matching value, to show it in the second column.
Like this:
x, metric
0, 0
1, 1
2, 2
3, null
4, 4
5, null
... , ...
... , ...
200, null
You can put generate_series() in the FROM clause. So, I think you want something like this:
select gs.x, cast(p.polutionmm2 / 100 as char(8)) as metric
from generate_series(0, 200, 1) gs(x)
left join polutiondistributionstatistic p
  on gs.x = (p.polutionmm2 / 100);
I imagine there is also more to your query, because this doesn't do much that is useful.
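For a runnable illustration of the same pattern, here is a Python sqlite3 sketch; SQLite's bundled build may lack generate_series, so a recursive CTE stands in for it, and the table holds a few hypothetical mm² values:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE polutiondistributionstatistic (polutionmm2 INT);
INSERT INTO polutiondistributionstatistic VALUES (0), (100), (200), (400);
""")
rows = con.execute("""
-- recursive CTE emulating generate_series(0, 5, 1)
WITH RECURSIVE gs(x) AS (
  SELECT 0 UNION ALL SELECT x + 1 FROM gs WHERE x < 5
)
SELECT gs.x, p.polutionmm2 / 100 AS metric
FROM gs
LEFT JOIN polutiondistributionstatistic p ON gs.x = p.polutionmm2 / 100
ORDER BY gs.x
""").fetchall()
print(rows)  # -> [(0, 0), (1, 1), (2, 2), (3, None), (4, 4), (5, None)]
```

The LEFT JOIN keeps every x and leaves NULL where no converted cm² value matches, exactly the shape the question asks for.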
I have two tables:
address_points
kmldata
address_points table columns: ID address Latitude1 Longitude2
kmldata table columns: Locname Lat Long
Now I want to see all the records of the address_points table whose Latitude1 and Longitude2 values fall in the range of Lat and Long values of kmldata table.
I have not handled comparison of locations in SQL Server before, so I don't know which function I can use here. I thought of the BETWEEN operator but can't seem to use it properly here. Any guidance on how I can achieve this?
You need to use the spatial functions within SQL Server. First you will need to aggregate all the points in your kmldata table to create an area, and then use STWithin to check which points fall within that area:
declare @kmlarea geography

select @kmlarea = geography::EnvelopeAggregate(geography::Point(Lat, Long, 4326))
from kmldata

select *
from address_points a
where geography::Point(Latitude1, Longitude2, 4326).STWithin(@kmlarea) = 1
There are several ways of calculating the geographical distance between two sets of coordinates. Already posted is the built in Geography method. There are also several good "home grown" functions based on a round earth model.
The hard part is making the actual comparison when you have large row counts in your source and destination tables. Comparing every source to every destination creates an unnecessarily large Cartesian product. I say "unnecessarily" because there's no point in calculating the distance between a source in Florida and a destination in California when I'm only interested in destinations within 15 miles of the source.
To solve the problem, I created a "bounding box" function that calculates a square(ish) box around a single set of coordinates. The code is posted below...
CREATE FUNCTION dbo.tfn_LatLngBoundingBox
/* ===================================================================
12/03/2019 JL, Created: Calculates the bounding box for a given set of Lat/Lng coordinates.
=================================================================== */
--===== Define I/O parameters
(
    @Lat DECIMAL(8,5),
    @Lng DECIMAL(8,5),
    @MaxDistance DECIMAL(8,3),
    @DistanceUnit CHAR(1) -- 'M' = miles; 'K' = kilometers
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT
    MinLat = CONVERT(decimal(8, 5), (x.MinLat / PI()) * 180),
    MaxLat = CONVERT(decimal(8, 5), (x.MaxLat / PI()) * 180),
    MinLng = CONVERT(decimal(8, 5), (y.MinLng / PI()) * 180),
    MaxLng = CONVERT(decimal(8, 5), (y.MaxLng / PI()) * 180)
FROM
    ( VALUES (
        CASE
            WHEN @DistanceUnit = 'K' THEN @MaxDistance / 6366.707019 -- Earth sphere radius in kilometers
            WHEN @DistanceUnit = 'M' THEN (@MaxDistance * 1.609344) / 6366.707019
        END,
        (@Lat / 180) * PI(),
        (@Lng / 180) * PI()
    ) ) r (DistRad, rLat, rLng)
    CROSS APPLY ( VALUES (r.rLat - r.DistRad, r.rLat + r.DistRad) ) x (MinLat, MaxLat)
    CROSS APPLY ( VALUES (ASIN(SIN(r.rLat) / COS(r.DistRad))) ) lt (LatT) -- = 1.4942 rad
    CROSS APPLY ( VALUES (ACOS( ( COS(r.DistRad) - SIN(lt.LatT) * SIN(r.rLat) ) / ( COS(lt.LatT) * COS(r.rLat) ) ) ) ) dl (DeltaLng)
    CROSS APPLY ( VALUES (r.rLng - dl.DeltaLng, r.rLng + dl.DeltaLng) ) y (MinLng, MaxLng);
GO
The use case looks like the following...
SELECT
s.Lat,
s.Lng,
d.Lat,
d.Lng,
dm.DistanceInMiles
FROM
dbo.[Source] s
CROSS APPLY dbo.tfn_LatLngBoundingBox(s.Lat, s.Lng, 15, 'M') bb
LEFT JOIN dbo.Destination d
ON d.lat BETWEEN bb.MinLat AND bb.MaxLat
AND d.Lng BETWEEN bb.MinLng AND bb.MaxLng
CROSS APPLY dbo.tfn_LatLonDistanceInMiles(s.Lat, s.Lng, d.Lat, d.Lng) dm
WHERE
dm.DistanceInMiles <= 15;
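For readers who want to check the math, here is an assumed-equivalent rendering of the bounding-box formulas in Python, using the same sphere radius and radian conversions as the T-SQL function above:

```python
import math

EARTH_RADIUS_KM = 6366.707019  # sphere radius used by the T-SQL function

def lat_lng_bounding_box(lat, lng, max_distance, unit='M'):
    """Return (min_lat, max_lat, min_lng, max_lng) in degrees."""
    dist_rad = (max_distance * 1.609344 if unit == 'M' else max_distance) / EARTH_RADIUS_KM
    r_lat, r_lng = math.radians(lat), math.radians(lng)
    # latitude bounds: move straight north/south by the angular distance
    min_lat, max_lat = r_lat - dist_rad, r_lat + dist_rad
    # longitude bounds: widen with latitude, as in the T-SQL CROSS APPLYs
    lat_t = math.asin(math.sin(r_lat) / math.cos(dist_rad))
    delta_lng = math.acos((math.cos(dist_rad) - math.sin(lat_t) * math.sin(r_lat))
                          / (math.cos(lat_t) * math.cos(r_lat)))
    return tuple(math.degrees(v) for v in
                 (min_lat, max_lat, r_lng - delta_lng, r_lng + delta_lng))

# A 15-mile box at the equator is symmetric, roughly 0.217 degrees each way.
print(lat_lng_bounding_box(0.0, 0.0, 15, 'M'))
```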
I need help in PostgreSQL.
I have two tables:
Prediction - predicts future disasters and casualties for each city.
Measures - matches damage-control providers (incl. cost and percent of "averted casualties") to each type of disaster.
Each disaster-and-provider combination has an amount of averted casualties (the percent from measures * the number of predicted casualties for that disaster * 0.01).
For each combination of city and disaster, I need to find two providers that
1) have a combined cost of less than a million, and
2) avert the biggest combined number of casualties.
My work and product so far
select o1.cname, o1.etype, o1.provider as provider1, o2.provider as provider2, (o1.averted + o2.averted) averted_casualties
from (select cname, m.etype, provider, mcost, (percent*Casualties*0.01)averted
from measures m, prediction p
where (m.etype = p.etype)) as o1, (select cname, m.etype, provider, mcost, (percent*Casualties*0.01)averted
from measures m, prediction p
where (m.etype = p.etype)) as o2
where (o1.cname = o2.cname) and (o1.etype = o2.etype) and (o1.provider < o2.provider) and (o1.mcost + o2.mcost < 1000000)
How do I change this query so it will show me the best averted_casualties for each city/disaster combo (not just the maximum over the whole table, but the maximum for each combo)?
This is the desired outcome:
P.S. I'm not allowed to use ordering, views or functions.
First, construct all pairs of providers and do the casualty and cost calculation:
select p.*, m1.provider as provider_1, m2.provider as provider_2,
       p.casualties * (1 - m1.percent / 100.0) * (1 - m2.percent / 100.0) as net_casualties,
       (m1.mcost + m2.mcost) as total_cost
from measures m1
join measures m2
  on m1.etype = m2.etype and m1.provider < m2.provider
join prediction p
  on m1.etype = p.etype;
Then, apply your conditions. Normally, you would use window functions, but since ordering isn't allowed for this exercise, you want to use a subquery:
with pairs as (
    select p.*, m1.provider as provider_1, m2.provider as provider_2,
           p.casualties * (1 - m1.percent / 100.0) * (1 - m2.percent / 100.0) as net_casualties,
           (m1.mcost + m2.mcost) as total_cost
    from measures m1
    join measures m2
      on m1.etype = m2.etype and m1.provider < m2.provider
    join prediction p
      on m1.etype = p.etype
)
select p.*
from pairs p
where p.total_cost < 1000000 and
      p.net_casualties = (select min(p2.net_casualties)
                          from pairs p2
                          where p2.cname = p.cname and p2.etype = p.etype and
                                p2.total_cost < 1000000
                         );
The biggest number of averted casualties results in the smallest number of net casualties. They are the same thing.
As for your attempted solution: the commas in the FROM clause tell me you need to study up on JOIN. Simple rule: never use commas in the FROM clause; always use proper, explicit, standard JOIN syntax.
Your repeated subqueries also suggest that you need to learn about CTEs.
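To make the approach concrete, here is a runnable miniature in Python's sqlite3, with invented measures/prediction rows and explicit columns in place of p.* (SQLite accepts the same CTE and correlated subquery):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE measures (etype TEXT, provider TEXT, mcost INT, percent INT);
CREATE TABLE prediction (cname TEXT, etype TEXT, casualties INT);
-- invented sample data: three flood providers, one predicted disaster
INSERT INTO measures VALUES
 ('flood','P1',400000,30),('flood','P2',500000,40),('flood','P3',700000,60);
INSERT INTO prediction VALUES ('Metropolis','flood',1000);
""")
rows = con.execute("""
WITH pairs AS (
  SELECT p.cname, p.etype, m1.provider AS provider_1, m2.provider AS provider_2,
         p.casualties * (1 - m1.percent / 100.0) * (1 - m2.percent / 100.0) AS net_casualties,
         m1.mcost + m2.mcost AS total_cost
  FROM measures m1
  JOIN measures m2 ON m1.etype = m2.etype AND m1.provider < m2.provider
  JOIN prediction p ON m1.etype = p.etype
)
SELECT cname, etype, provider_1, provider_2, net_casualties
FROM pairs p
WHERE total_cost < 1000000
  AND net_casualties = (SELECT MIN(p2.net_casualties) FROM pairs p2
                        WHERE p2.cname = p.cname AND p2.etype = p.etype
                          AND p2.total_cost < 1000000)
""").fetchall()
print(rows)
```

The P1+P3 and P2+P3 pairs exceed the million budget, so the cheapest qualifying pair P1+P2 wins with 1000 * 0.7 * 0.6 = 420 net casualties.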
I have two tables; one is a reference table all_grid and the other has customer details on it t_customer.
I need to present the rows that are in the reference table but not in the customer table (i.e. show the rows where customer_x and customer_y columns are present in the all_grid but not in t_customer). Columns are named the same in both tables, but t_customer has an id column too.
Currently I've tried
SELECT customer_x, customer_y FROM all_grid
EXCEPT
SELECT customer_x, customer_y FROM t_customer;
but this seems to just show all rows in all_grid and I'm not sure which terminology to use for SQLite.
t_customer table is as follows:
1|35|24
2|-20|30
3|-10|-20
4|35|-46
5|4|-19
6|30|36
7|-12|-24
8|-12|-16
9|-17|-10
10|99|99
11|-4|-29
12|35|24
13|13|28
14|99|99
15|-24|-3
16|-49|-39
17|99|99
18|-48|-44
19|-46|35
20|-28|-47
21|99|99
22|99|99
23|31|22
24|4|14
25|5|6
26|32|24
27|-34|-4
28|29|25
29|-12|-31
30|99|99
31|-17|41
32|-20|-42
33|99|99
34|-4|40
and all_grid is all 100 possible mixes of customer_x and customer_y rounded down to the nearest 10, and (90, 90) included.
Apologies, I've realised I did something stupid and didn't consider the fact that the t_customer data had been rounded up and down to 10 in prior calculations.
My final query was:
SELECT (customer_x * 10), (customer_y * 10) FROM all_grid
EXCEPT
SELECT 10 * (t.customer_x / 10), 10 * (t.customer_y / 10) FROM
(SELECT CASE WHEN customer_x < 0 THEN customer_x - 10 ELSE customer_x END AS customer_x,
CASE WHEN customer_y < 0 THEN customer_y - 10 ELSE customer_y END AS customer_y
FROM t_customer) t
and seems to do the trick.
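One caveat worth checking: integer division in SQLite truncates toward zero, and the subtract-10 trick over-shifts values that are already negative multiples of 10 (-20 lands in the -30 bucket). A modulo-based floor avoids that edge case; here is a small sqlite3 sketch comparing the two:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SQLite truncates integer division toward zero, so -12 / 10 * 10 gives -10,
# not the floor value -20. The CASE trick shifts negatives first; the
# modulo-based floor also handles exact negative multiples of 10.
rows = con.execute("""
SELECT v,
       10 * ((CASE WHEN v < 0 THEN v - 10 ELSE v END) / 10) AS case_floor,
       v - ((v % 10) + 10) % 10                             AS mod_floor
FROM (SELECT 35 AS v UNION ALL SELECT -12 UNION ALL SELECT -20)
ORDER BY v DESC
""").fetchall()
print(rows)  # -> [(35, 30, 30), (-12, -20, -20), (-20, -30, -20)]
```

Both forms agree except at -20, where the subtract-10 version shifts one bucket too far.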