Count number of points within different ranges SQL - sql

We have real estate point X.
We want to calculate the number of stations within
0-200 m
200-400 m
400-600 m
Once I have this, I will create a new table where these counts are summarized according to mathematical expressions.
SELECT loc_dist.id, loc_dist.namn1, grps.grp, count(*)
FROM (
SELECT b.id, b.namn1, ST_Distance_Sphere(b.geom, s.geom) AS dist
FROM stations s, bostader b) AS loc_dist
JOIN (
VALUES (1,200.), (2,400.), (3,600.)
) AS grps(grp, dist) ON loc_dist.dist < grps.dist
GROUP BY 1,2,3
ORDER BY 1,2,3;
This is what I have now, but it takes forever to run and I can't get any results, since both b and s have more than 2000 entries. I want the number of s for a specific b, but this calculates it for all of them. How do I add a:
WHERE b.id= 114477
for example? I only get a syntax error on the join when I try to do this. I only want the distance groups for one, or maybe five, different b, selected by their b.id.

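After a lot of help from TA, here is the answer, which works nicely. I added ranges and a BETWEEN clause to get the count within each ring. Note that BETWEEN is inclusive at both ends, so a station at exactly 200 m or 400 m falls into two rings.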
SELECT loc_dist.id, loc_dist.namn1, grps.grp, count(*)
FROM (
SELECT b.id, b.namn1, ST_Distance_Sphere(b.geom, s.geom) AS dist
FROM stations s, bostader b WHERE b.id=114477) AS loc_dist
JOIN (
VALUES (1,0,200), (2,200,400), (3,400,600)
) AS grps(grp, dist_l, dist_u) ON loc_dist.dist BETWEEN dist_l AND dist_u
GROUP BY 1,2,3
ORDER BY 1,2,3;
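The ring logic can also be checked outside the database. Below is a minimal Python sketch, not the poster's method: it assumes the distances for one property have already been computed, and the names `RINGS` and `count_per_ring` as well as the sample distances are made up for illustration. It uses half-open intervals, so a distance that sits exactly on a boundary is counted in one ring only (unlike BETWEEN, which is inclusive at both ends).

```python
from collections import Counter

# Hypothetical ring boundaries in metres, mirroring the 0-200, 200-400, 400-600 groups.
RINGS = [(1, 0.0, 200.0), (2, 200.0, 400.0), (3, 400.0, 600.0)]


def count_per_ring(distances, rings=RINGS):
    """Count distances per ring, using half-open intervals [lower, upper)."""
    counts = Counter()
    for dist in distances:
        for grp, lower, upper in rings:
            if lower <= dist < upper:
                counts[grp] += 1
                break  # a distance belongs to at most one ring
    return dict(counts)


# Made-up distances (metres) from one property to its stations.
sample = [50.0, 199.9, 200.0, 350.0, 599.0, 600.0, 1200.0]
ring_counts = count_per_ring(sample)
print(ring_counts)
```

Distances of 600 m or more fall outside every ring and are simply ignored, which matches the inner join in the query above.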

Related

SQL / BigQuery- How to group by chain of pairs in different columns?

I am trying to aggregate rows on a chain of pairs. For example, given table:
pair0 | pair1
a z
b a
c b
d b
m n
z y
Returns:
matches
[a, z, b, c, d, y]
[m, n]
where order of the pairs doesn't matter.
I've tried joining the table on itself, but am unable to aggregate in this way without putting the join in the loop, for the max number of possible combinations.
SELECT
[a.pair0, a.pair1, b.pair0, b.pair1] as matches
FROM pairs a
LEFT JOIN pairs b
ON a.pair0 = b.pair1
GROUP BY
matches
and then would filter matches for distinct. but, this solution only works if the chain is limited to two rows. In the example above, the chain extends for 5 rows. Grouping by an array also is not allowed.
BigQuery supports recursive queries: yay! With this feature at hand, we can try and solve this graph-walking problem.
The idea is to start by generating all possible edges, and then recursively traverse the graph while taking care not to visit the same node twice. Once all paths are visited, we can identify which group each node belongs to by looking at the aggregated list of its visited nodes.
with recursive
edges as (
-- list every pair in both directions so the graph can be walked either way
select pair0, pair1 from pairs
union all
select pair1, pair0 from pairs
),
cte as (
select pair0, pair1, [ pair1 ] visited
from edges
union all
select c.pair0, e.pair1, array_concat(c.visited, [ e.pair1 ] )
from cte c
inner join edges e on e.pair0 = c.pair1 and e.pair1 not in unnest(c.visited)
),
res as (
select pair0, array_agg(distinct pair1 order by pair1) grp
from cte
group by pair0
)
select distinct grp from res
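For comparison, the same grouping can be sketched in plain Python as a connected-components search. This is only an illustration of the graph-walking idea, not a translation of the query: the function name `group_pairs` is hypothetical, and it assumes the pairs fit in memory.

```python
from collections import defaultdict


def group_pairs(pairs):
    """Return the connected components of an undirected graph given as edge pairs."""
    neighbours = defaultdict(set)
    for left, right in pairs:
        # Store each edge in both directions, like the edges CTE does.
        neighbours[left].add(right)
        neighbours[right].add(left)

    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Iterative depth-first walk that never visits a node twice.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# The sample table from the question.
pairs = [("a", "z"), ("b", "a"), ("c", "b"), ("d", "b"), ("m", "n"), ("z", "y")]
groups = group_pairs(pairs)
print(groups)
```

Each node is visited once, so the walk stays linear in the number of pairs, whereas the recursive query enumerates paths.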

Count Distinct values in one column based on other column

I am trying to count the distinct values of Z_l for a given value of X, using a WITH clause. Sample data is included below.
Please look at the picture: I want the distinct values of Z_l where X='ny'.
with distincz_l as (select ny.X, ny.z_l o.cnt From HOPL ny join (select X, count(*) as cnt from HOPL group by X) o on (ny.X = o.Z_l)) select * from HOPL;
You don't even need a WITH clause; a single statement is enough:
SELECT z_l, count(1)
FROM hopl
WHERE x='ny'
GROUP BY z_l
;

Number of rows returned for each point PostGIS

I have two datasets of points. I want to calculate the number of points in table B within 2km of each location in table A and store this as an integer column.
I use the code below to return each point from A within 2km of B, but I am unsure how to extract the number of occurrences of A in this result and store it to a new column.
select a.name, b.incident_zip
from ny_air_bnb as a, nypubs as b
where(st_transform(a.geom,2263) <-> st_transform(b.geom,2263) <= 2000)
Perhaps you just want window functions:
select a.name, b.incident_zip, count(*) over (partition by a.name)
from ny_air_bnb a join
nypubs as b
on (st_transform(a.geom,2263) <-> st_transform(b.geom,2263) <= 2000);
EDIT:
If you want to store this in ny_air_bnb, you need an UPDATE:
update ny_air_bnb a
set col = (select count(*)
from nypubs p
where st_transform(a.geom, 2263) <-> st_transform(p.geom, 2263) <= 2000
);

How to select rows m through n in access?

I am modifying a query for a sub-report in Access 2016 and need to select a set of rows, not all rows. By default the query generated looks like:
SELECT table_name.a, table_name.b, table_name.c
FROM table_name
WHERE (((table_name.dist_ft)<3001));
How can I select only rows m through n instead of all rows?
Thanks for your insights!
Edit, an additional clarification: when I run the query like
SELECT TOP 16 *
FROM table_name
WHERE (((table_name.dist_ft)<3001));
... or any other variation I've tried with TOP my sub-report does not get populated. It only contains data when all fields are selected and TOP is not used. I must be missing something.
Is ID your "row number" here?
SELECT table_name.a, table_name.b, table_name.c
FROM table_name
WHERE table_name.dist_ft<3001
AND table_name.ID>=m
AND table_name.ID<=n
;
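Updated with a more general case based on comments. Replace n and m-1 with literal numbers; the second subquery must return the m-1 rows that come before the first wanted row, and both subqueries need the same ORDER BY:
SELECT table_name.a, table_name.b, table_name.c
FROM table_name
WHERE table_name.id IN
(SELECT TOP n table_name.id FROM table_name ORDER BY table_name.id)
AND table_name.id NOT IN
(SELECT TOP m-1 table_name.id FROM table_name ORDER BY table_name.id)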
Records m to n are the records 1 to n minus the records 1 to m-1. Be aware though that you need an ORDER BY clause for a TOP clause to make sense.
Here is an example with m = 31 to n = 40 and an ORDER BY on all three selected columns. MS Access does not support EXCEPT, so we cannot subtract the two data sets, which would be the straightforward way to go. We could also express the desired result as the top n where (a,b,c) is not in the top m-1, but MS Access does not support an IN clause on multiple columns either. So I am using an anti join here (for which I select dist_ft, but it can be any non-nullable column of the table).
In case your table has a unique ID column, you can use a more readable where (id) not in (select top 30 id ...) instead of an anti join. Either way, make sure to apply the same WHERE clause (dist_ft < 3001 in your case) and ORDER BY clause (e.g. ORDER BY a, b, c) to the main query and the subquery.
SELECT TOP 10 t.a, t.b, t.c
FROM table_name t
LEFT JOIN
(
SELECT TOP 30 a, b, c, dist_ft
FROM table_name
WHERE dist_ft < 3001
ORDER BY a, b, c
) skipped ON skipped.a = t.a AND skipped.b = t.b AND skipped.c = t.c
WHERE t.dist_ft < 3001
AND skipped.dist_ft is null
ORDER BY t.a, t.b, t.c;
MS Access is known for requiring additional parentheses on multiple joins. I cannot say whether the above query works straight away or whether parentheses must be added somewhere.
You sort thrice to get the result you are after. Let's say we want rows 31 to 40:
Sort and get the top 40
Sort in reverse order and get the top 10
Sort again to get the order you actually want
The query:
SELECT a, b, c
FROM
(
SELECT TOP 10 a, b, c
FROM
(
SELECT TOP 40 a, b, c
FROM table_name
WHERE dist_ft < 3001
ORDER BY a, b, c
) top_n
ORDER BY a desc, b desc, c desc
) top_m_to_n
ORDER BY a, b, c;
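The three-sort idea can be tried out without Access. The following is a sketch under stated assumptions: it uses Python's built-in sqlite3 module, SQLite has no TOP so LIMIT stands in for it, and the 50 rows of test data are made up. It is meant to show the logic, not to be pasted into Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (a INTEGER, b INTEGER, c INTEGER, dist_ft INTEGER)"
)
# 50 made-up rows; every one satisfies dist_ft < 3001.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?, ?)",
    [(i, i * 2, i * 3, 100) for i in range(1, 51)],
)

# LIMIT replaces TOP here because SQLite does not support TOP.
query = """
SELECT a, b, c
FROM (
    SELECT a, b, c
    FROM (
        SELECT a, b, c
        FROM table_name
        WHERE dist_ft < 3001
        ORDER BY a, b, c
        LIMIT 40              -- step 1: first 40 rows
    ) AS top_n
    ORDER BY a DESC, b DESC, c DESC
    LIMIT 10                  -- step 2: last 10 of those 40
) AS top_m_to_n
ORDER BY a, b, c              -- step 3: restore the wanted order
"""
rows = conn.execute(query).fetchall()
print(rows[0], rows[-1])
```

With this test data the result holds rows 31 through 40 in ascending order.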

SQL percentage of the total

Hi, how can I get the percentage of each record over the total?
Let's imagine I have one table with the following data:
ID code Points
1 101 2
2 201 3
3 233 4
4 123 1
The percentage for ID 1 is 20%, for ID 2 it is 30%, and so on.
How do I get it?
There are a couple of approaches to getting that result.
You essentially need the "total" points from the whole table (or whatever subset), and get that repeated on each row. Getting the percentage is then a simple matter of arithmetic; the expression you use for that depends on the datatypes and how you want the result formatted.
Here's one way (out of a couple of possible ways) to get the specified result:
SELECT t.id
, t.code
, t.points
-- , s.tot_points
, ROUND(t.points * 100.0 / s.tot_points,1) AS percentage
FROM onetable t
CROSS
JOIN ( SELECT SUM(r.points) AS tot_points
FROM onetable r
) s
ORDER BY t.id
The view query s is run first, that gives a single row. The join operation matches that row with every row from t. And that gives us the values we need to calculate a percentage.
Another way to get this result, without using a join operation, is to use a subquery in the SELECT list to return the total.
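A minimal sketch of that subquery variant, run through Python's built-in sqlite3 module so it can be tried locally. The table name onetable and the four rows come from the question; using SQLite here is an assumption, and other databases may round or format the result differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE onetable (id INTEGER, code TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO onetable VALUES (?, ?, ?)",
    [(1, "101", 2), (2, "201", 3), (3, "233", 4), (4, "123", 1)],
)

# The scalar subquery in the SELECT list returns the grand total.
# Multiplying by 100.0 forces floating-point division.
query = """
SELECT t.id,
       t.code,
       t.points,
       ROUND(t.points * 100.0 / (SELECT SUM(r.points) FROM onetable r), 1) AS percentage
FROM onetable t
ORDER BY t.id
"""
percentages = [(row[0], row[3]) for row in conn.execute(query)]
print(percentages)
```

Because the subquery does not reference t, it is uncorrelated and the database can evaluate it once rather than per row.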
Note that the join approach can be extended to get percentage for each "group" of records.
id type points %type
-- ---- ------ -----
1 sold 11 22%
2 sold 4 8%
3 sold 25 50%
4 bought 1 50%
5 bought 1 50%
6 sold 10 20%
To get that result, we can use the same query, but with a view query for s that returns the total per type (GROUP BY r.type). The join operation is then not a CROSS join but a match based on type:
SELECT t.id
, t.type
, t.points
-- , s.tot_points_by_type
, ROUND(t.points * 100.0 / s.tot_points_by_type,1) AS `%type`
FROM onetable t
JOIN ( SELECT r.type
, SUM(r.points) AS tot_points_by_type
FROM onetable r
GROUP BY r.type
) s
ON s.type = t.type
ORDER BY t.id
To do that same result with the subquery, that's going to be a correlated subquery, and that subquery is likely to get executed for every row in t.
This is why it's more natural for me to use a join operation, rather than a subquery in the SELECT list... even when a subquery works the same. (The patterns we use for more complex queries, like assigning aliases to tables, qualifying all column references, and formatting the SQL... those patterns just work their way back into simple queries. The rationale for these patterns is kind of lost in simple queries.)
Try it like this:
select id,code,points,(points * 100.0)/(select sum(points) from table1) from table1
To add to a good list of responses, this should be fast performance-wise, and rather easy to understand:
DECLARE @T TABLE (ID INT, code VARCHAR(256), Points INT)
INSERT INTO @T VALUES (1,'101',2), (2,'201',3),(3,'233',4), (4,'123',1)
;WITH CTE AS
(SELECT * FROM @T)
SELECT C.*, CAST(ROUND((C.Points/B.TOTAL)*100, 2) AS DEC(32,2)) [%_of_TOTAL]
FROM CTE C
JOIN (SELECT CAST(SUM(Points) AS DEC(32,2)) TOTAL FROM CTE) B ON 1=1
Just replace the table variable with your actual table inside the CTE.