Counting points/coordinates that lie within a bounding box - sql

I have 2 tables. The first table contains following columns: Start_latitude, start_longitude, end_latitude, end_longitude, sum. The sum column is empty and needs to be filled based on second table.
The second table contains 3 columns: point_latitude, point_longitude
Table 1
-------------------------
|45 | 50 | 46 | 51 | null|
----|---------------------
|45 | 54 | 46 | 57 | null|
--------------------------
Table2:
---------------
| 45.5 | 55.2 |
---------------
| 45.8 | 50.6 |
---------------
| 45.2 | 56 |
---------------
The null values in table1-row1 would be 1 while in row2 it would be 2. It is the count of number of points that lie within the bounding box.
I can do it in python by writing functions to read values between dataframes. How can this be done in Postgresql. This is a sample problem statement that I came up with for my situation.

Update
This version was tested on PostgreSql 9.3 using SQL Fiddle
UPDATE table1 a
SET sum = sub.point_count
FROM (SELECT a.start_lat, a.end_lat, a.start_lon, a.end_lon, COUNT(*) as point_count
FROM table1 a, table2 b
WHERE b.point_lat BETWEEN start_lat AND a.end_lat
AND b.point_lon BETWEEN a.start_lon AND a.end_lon
GROUP BY a.start_lat, a.end_lat, a.start_lon, a.end_lon) as sub
WHERE a.start_lat = sub.start_lat
AND a.end_lat = sub.end_lat
AND a.start_lon = sub.start_lon
AND a.end_lon = sub.end_lon;
Original answer
Here is my solution, it is tested on MySQL but there is nothing specific about this code so it should work on PostgreSql as well
UPDATE table1 a,
(SELECT a.start_lat, a.end_lat, a.start_lon, a.end_lon, COUNT(*) as count
FROM table1 a, table2 b
WHERE b.point_lat BETWEEN start_lat AND a.end_lat
AND b.point_lon BETWEEN a.start_lon AND a.end_lon
GROUP BY a.start_lat, a.end_lat, a.start_lon, a.end_lon) as sub
SET sum = count
WHERE a.start_lat = sub.start_lat
AND a.end_lat = sub.end_lat
AND a.start_lon = sub.start_lon
AND a.end_lon = sub.end_lon
Note that this query would be much shorter if table1 contained a PK Id column.

Related

Calculate overall percentage of Access Query

I have an MS Access Query which returns the following sample data:
+-----+------+------+
| Ref | ANS1 | ANS2 |
+-----+------+------+
| 123 | A | A |
| 234 | B | B |
| 345 | C | C |
| 456 | D | E |
| 567 | F | G |
| 678 | H | I |
+-----+------+------+
Is it possible to have Access return the overall percentage where ANS1 = ANS2?
So my new query would return:
50
I know how to get a count of the records returned by the original query, but not how to calculate the percentage.
Since you're looking for a percentage of some condition being met across the entire dataset, the task can be reduced to having a function return either 1 (when the condition is validated), or 0 (when the condition is not validated), and then calculating an average across all records.
This could be achieved in a number of ways, one example might be to use a basic iif statement:
select avg(iif(t.ans1=t.ans2,1,0)) from YourTable t
Or, using the knowledge that a boolean value in MS Access is represented using -1 (True) or 0 (False), the expression can be reduced to:
select -avg(t.ans1=t.ans2) from YourTable t
In each of the above, change YourTable to the name of your table.
If you know how to get a count, then apply that same knowledge twice:
SELECT Count([ANS1]) As MatchCount FROM [Data]
WHERE [ANS1] = [ANS2]
divided by the total count
SELECT Count([ANS1]) As AllCount FROM [Data]
To combine both of these in a basic SQL query, one needs a "dummy" query since Access doesn't allow selection of only raw data:
SELECT TOP 1
((SELECT Count([ANS1]) As MatchCount FROM [Data] WHERE [ANS1] = [ANS2])
/
(SELECT Count([ANS1]) As AllCount FROM [Data]))
AS MatchPercent
FROM [Data]
This of course assumes that there is at least one row... so it doesn't divide by zero.

Display two linked values for two id field pointing the same table

In my PostgreSQL database I have a table like this one:
id|link1|link2|
---------------
1 | 34 | 66
2 | 23 | 8
3 | 11 | 99
link1 and link2 fields are both pointing to the same table table2 which has id and descr fields.
I would make an SQL query that returns the same row the id and the descr value for the two field like this:
id|link1|link2|desc_l1|desc_l2|
-------------------------------
1 | 34 |66 | bla | sisusj|
2 | 23 | 8 | ghhj | yui |
3 | 11 | 99 | erd | bnbn |
I've try different queries, but everyone returns two rows per id instead of one.
How can I achieve these results in my PostgreSQL 9.04 database?
Normally, this query should work for you. Assume your first table name's table_name.
SELECT t.id, t.link1, t.link2,
l1.descr AS desc_l1,
l2.descr AS desc_l2
FROM table_name t
LEFT JOIN table2 l1
ON t.link1 = l1.id
LEFT JOIN table2 l2
ON t.link2 = l2.id;
you can use case here Like:
select link1,link2,
case
when link1='34' and link2='66' then 'bla'
when link1='23' and link2='8' then 'ghs'
when link1='11' and link2='99' then 'erd'
end as desc_li,
case
when link1='34' and link2='66' then 'sjm'
when link1='23' and link2='8' then 'yur'
when link1='11' and link2='99' then 'bnn'
end as desc_l2
from table1

Join two tables juxtaposing columns with same name sql

I have two sqlite3 tables with same column names and I want to compare them. To do that, I need to join the tables and juxtapose the columns with same name.
The tables share an identical column which I want to put as the first column.
Let's imagine I have table t1 and table t2
Table t1:
SharedColumn | Height | Weight
A | 2 | 70
B | 10 | 100
Table t2:
SharedColumn | Height | Weight
A | 5 | 25
B | 32 | 30
What I want get as a result of my query is :
SharedColumn | Height_1 | Height_2 | Weight_1 | Weight_2
A | 2 | 5 | 70 | 25
B | 10 | 32 | 100 | 30
In my real case i have a lot of columns so I would like to avoid writing each column name twice to specify the order.
Renaming the columns is not my main concern, what interests me the most is the juxtaposition of columns with same name.
There is no way to do that directly in SQL especially because you also want to rename the columns to identify their source, you'll have to use dynamic SQL and honestly? Don't! .
Simply write the columns names, most SQL tools provide a way to generate the select, just copy them and place them in the correct places :
SELECT t1.sharedColumn,t1.height as height_1,t2.height as height_2 ...
FROM t1
JOIN t2 ON(t1.sharedColumn = t2.sharedColumn)+
Try the following query to get the desired result!!
SELECT t1.Height AS Height_1, t1.Weight AS Weight_1, t1.sharedColumn AS SharedColumn
t2.Height AS Height_2, t2.Weight AS Weight_2
FROM t1 INNER JOIN t2
ON t1.sharedColumn = t2.sharedColumn
ORDER By t1.sharedColumn ASC
After that, you can fetch the result by following lines:
$result['SharedColumn'];
$result['Height_1'];
$result['Height_2'];
$result['Weight_1'];
$result['Weight_1'];

Select multiple (non-aggregate function) columns with GROUP BY

I am trying to select the max value from one column, while grouping by another non-unique id column which has multiple duplicate values. The original database looks something like:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65789 | 15 | b | 8m
65789 | 1 | c | 1o
65790 | 10 | a | 7n
65790 | 26 | b | 8m
65790 | 5 | c | 1o
...
This works just fine using:
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey;
Which returns a table like:
mukey | ComponentPercent
65789 | 20
65790 | 26
65791 | 50
65792 | 90
I want to be able to add other columns in without affecting the GROUP BY function, to include columns like name and type into the output table like:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65790 | 26 | b | 8m
65791 | 50 | c | 7n
65792 | 90 | d | 7n
but it always outputs an error saying I need to use an aggregate function with select statement. How should I go about doing this?
You have yourself a greatest-n-per-group problem. This is one of the possible solutions:
select c.mukey, c.comppct_r, c.name, c.type
from c yt
inner join(
select c.mukey, max(c.comppct_r) comppct_r
from c
group by c.mukey
) ss on c.mukey = ss.mukey and c.comppct_r= ss.comppct_r
Another possible approach, same output:
select c1.*
from c c1
left outer join c c2
on (c1.mukey = c2.mukey and c1.comppct_r < c2.comppct_r)
where c2.mukey is null;
There's a comprehensive and explanatory answer on the topic here: SQL Select only rows with Max Value on a Column
Any non-aggregate column should be there in Group By clause .. why??
t1
x1 y1 z1
1 2 5
2 2 7
Now you are trying to write a query like:
select x1,y1,max(z1) from t1 group by y1;
Now this query will result only one row, but what should be the value of x1?? This is basically an undefined behaviour. To overcome this, SQL will error out this query.
Now, coming to the point, you can either chose aggregate function for x1 or you can add x1 to group by. Note that this all depends on your requirement.
If you want all rows with aggregation on z1 grouping by y1, you may use SubQ approach.
Select x1,y1,(select max(z1) from t1 where tt.y1=y1 group by y1)
from t1 tt;
This will produce a result like:
t1
x1 y1 max(z1)
1 2 7
2 2 7
Try using a virtual table as follows:
SELECT vt.*,c.name FROM(
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.muke;
) as VT, c
WHERE VT.mukey = c.mukey
You can't just add additional columns without adding them to the GROUP BY or applying an aggregate function. The reason for that is, that the values of a column can be different inside one group. For example, you could have two rows:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65789 | 20 | b | 9f
How should the aggregated group look like for the columns name and type?
If name and type is always the same inside a group, just add it to the GROUP BY clause:
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.muke, c.name, c.type;
Use a 'Having' clause
SELECT *
FROM c
GROUP BY c.mukey
HAVING c.comppct_r = Max(c.comppct_r);

Finding the difference between two sets of data from the same table

My data looks like:
run | line | checksum | group
-----------------------------
1 | 3 | 123 | 1
1 | 7 | 123 | 1
1 | 4 | 123 | 2
1 | 5 | 124 | 2
2 | 3 | 123 | 1
2 | 7 | 123 | 1
2 | 4 | 124 | 2
2 | 4 | 124 | 2
and I need a query that returns me the new entries in run 2
run | line | checksum | group
-----------------------------
2 | 4 | 124 | 2
2 | 4 | 124 | 2
I tried several things, but I never got to a satisfying answer.
In this case I'm using H2, but of course I'm interested in a general explanation that would help me to wrap my head around the concept.
EDIT:
OK, it's my first post here so please forgive if I didn't state the question precisely enough.
Basically given two run values (r1, r2, with r2 > r1) I want to determine which rows having row = r2 have a different line, checksum or group from any row where row = r1.
select * from yourtable
where run = 2 and checksum = (select max(checksum)
from yourtable)
Assuming your last run will have the higher run value than others, below SQL will help
select * from table1 t1
where t1.run in
(select max(t2.run) table1 t2)
Update:
Above SQL may not give you the right rows because your requirement is not so clear. But the overall idea is to fetch the rows based on the latest run parameters.
SELECT line, checksum, group
FROM TableX
WHERE run = 2
EXCEPT
SELECT line, checksum, group
FROM TableX
WHERE run = 1
or (with slightly different result):
SELECT *
FROM TableX x
WHERE run = 2
AND NOT EXISTS
( SELECT *
FROM TableX x2
WHERE run = 1
AND x2.line = x.line
AND x2.checksum = x.checksum
AND x2.group = x.group
)
A slightly different approach:
select min(run) run, line, checksum, group
from mytable
where run in (1,2)
group by line, checksum, group
having count(*)=1 and min(run)=2
Incidentally, I assume that the "group" column in your table isn't actually called group - this is a reserved word in SQL and would need to be enclosed in double quotes (or backticks or square brackets, depending on which RDBMS you are using).