Inconsistent geolocation query results in SQL - sql

I wrote a simple web application that lets you mark flea market stands on a google map.
Each stand is stored in a sqlite3 database with its geolocation and other information.
This is the CREATE statement for the stands table:
CREATE TABLE stands (
id INTEGER PRIMARY KEY,
name TEXT,
address TEXT,
u REAL,
v REAL
);
u and v are respectively Latitude and Longitude.
Additionally, I have a cities table that stores the name and geographic bounds of each city that hosts a stand. This is used to let users quickly navigate between cities.
CREATE TABLE cities
(name TEXT PRIMARY KEY,
u_min REAL,
u_max REAL,
v_min REAL,
v_max REAL);
When a new stand is added, either a new row is added to the cities table or, if the stand is in a known city, the bounds of that city are widened if needed.
Here are some sample stands:
592077673|Kierrätystori Rovaniemellä|Urheilukatu 1, 96100 Rovaniemi, Suomi|66.4978306921681|25.7220569153442
1321495145|Kruununhaka|Liisankatu, 00170 Helsinki, Suomi|60.1742596|24.9555782
571688977|Viikki asukastalo LAVAn edusta|Biologinkatu 5, 00790 Helsinki, Suomi|60.2342312|25.04058
563089951|Hämeentie 156|Hämeentie 156, 00560 Helsinki, Suomi|60.2130467082539|24.9785856067459
518892420|Joensuu - Ilosaari|Siltakatu 1, 80100 Joensuu, Finland|62.5990455742272|29.7706540507875
and cities:
Rovaniemi|66.4978306921681|66.4978306921681|25.7220569153442|25.7220569153442
Helsinki|60.1577049447137|60.2556221042622|24.9216988767212|25.0662129772156
Järvenpää|60.4513724|60.4513724|25.0819323000001|25.0819323000001
Joensuu|62.5990455742272|62.5990653244875|29.7706540507874|29.7706540507875
Vantaa|60.2731724937748|60.2731724937748|24.9571491285278|24.9571491285278
The issue I'm having is retrieving the number of stands per city.
So far I've been using the following query:
SELECT cities.name AS city,
cities.u_min,
cities.u_max,
cities.v_min,
cities.v_max,
count(stands.id) AS count
FROM cities
LEFT OUTER JOIN stands
ON ((stands.u BETWEEN cities.u_min AND cities.u_max)
AND(stands.v BETWEEN cities.v_min AND cities.v_max))
GROUP BY cities.name;
This returns:
Helsinki|60.1577049447137|60.2556221042622|24.9216988767212|25.0662129772156|9
**Joensuu|62.5990455742272|62.5990653244875|29.7706540507874|29.7706540507875|0**
Järvenpää|60.4513724|60.4513724|25.0819323000001|25.0819323000001|1
Rovaniemi|66.4978306921681|66.4978306921681|25.7220569153442|25.7220569153442|1
Vantaa|60.2731724937748|60.2731724937748|24.9571491285278|24.9571491285278|1
Which is not correct as the city named Joensuu does have 1 stand in its boundaries:
518892420|Joensuu - Ilosaari|Siltakatu 1, 80100 Joensuu, Finland|62.5990455742272|29.7706540507875
But the following query returns the expected stand:
SELECT * FROM stands where u between 62.5990455742272 and 62.5990653244875 and v between 29.7706540507874 and 29.7706540507875;
I really can't understand what is going wrong here.
Any help would be greatly appreciated.
By the way, I imported this database into MySQL and the same thing happens, so I doubt this is a sqlite3 bug.

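I think this has to do with floating-point precision. The values stored in the tables likely carry more digits than the ones displayed, so a stand's coordinate can fall just outside a boundary that looks identical when printed. One way to deal with such problems is to pick a small number and add it to your boundaries to make them a little wider, which absorbs the precision error.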
One approach is to widen boundaries in every query directly:
SET @e = 0.0000000000001;
SELECT cities.name AS city,
cities.u_min,
cities.u_max,
cities.v_min,
cities.v_max,
count(stands.id) AS count
FROM cities
LEFT OUTER JOIN stands
ON ((stands.u BETWEEN cities.u_min - @e AND cities.u_max + @e)
AND (stands.v BETWEEN cities.v_min - @e AND cities.v_max + @e))
GROUP BY cities.name;
Another approach is to store the widened boundaries in the cities table:
SET @e = 0.0000000000001;
UPDATE cities
SET cities.u_min = cities.u_min - @e,
cities.u_max = cities.u_max + @e,
cities.v_min = cities.v_min - @e,
cities.v_max = cities.v_max + @e;
P.S. I am not sure if the variable syntax works in SQLite, but if it doesn't, just substitute all @e with 0.0000000000001.
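To illustrate the precision effect, here is a minimal sketch using Python's built-in sqlite3 module. The tables bounds and points and the values 0.1 + 0.2 and 0.3 are made up for the demonstration (they are not the stands data); they were chosen because both print as 0.3 at 15 significant digits while the stored values differ.

```python
import sqlite3

# Hypothetical tables and values, chosen only to show the effect.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bounds (v_min REAL, v_max REAL)")
conn.execute("CREATE TABLE points (v REAL)")
conn.execute("INSERT INTO bounds VALUES (?, ?)", (0.0, 0.3))
conn.execute("INSERT INTO points VALUES (?)", (0.1 + 0.2,))

# Both values look identical when printed with 15 significant digits.
shown_point = "%.15g" % (0.1 + 0.2)
shown_bound = "%.15g" % 0.3

# Exact comparison: the stored point is slightly above the stored v_max.
exact = conn.execute(
    "SELECT count(*) FROM points JOIN bounds "
    "ON points.v BETWEEN bounds.v_min AND bounds.v_max"
).fetchone()[0]

# Same comparison with the bounds widened by a small epsilon.
e = 1e-13
widened = conn.execute(
    "SELECT count(*) FROM points JOIN bounds "
    "ON points.v BETWEEN bounds.v_min - ? AND bounds.v_max + ?",
    (e, e),
).fetchone()[0]

print(shown_point, shown_bound, exact, widened)
```

The exact join finds no match while the widened one does, which is the same pattern as the Joensuu row. The right size for the epsilon depends on the data; 1e-13 here is only an illustration.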

The reason is that you use LEFT JOIN, so it includes all cities, regardless of whether there is a match in the stands table. Notice how COUNT returns 0 for that row.
Just use INNER JOIN instead of LEFT JOIN. That will tell MySQL to only return rows for which the condition in ON is met.
UPD: I misunderstood the question, so this answer is wrong. I'll post another answer.

Related

SQL query , group by only one column

I want to group this query by project only, because there are two records for the same project but I only want one.
But when I add a GROUP BY clause, it asks me to add the other columns as well, and then the grouping does not work.
DECLARE @Year varchar(75) = '2018'
DECLARE @von DateTime = '1.09.2018'
DECLARE @bis DateTime = '30.09.2018'
select new_projekt ,new_geschftsartname, new_mitarbeitername, new_stundensatz
from Filterednew_projektkondition ps
left join Filterednew_fakturierungsplan fp on ps.new_projekt = fp.new_hauptprojekt1
where ps.statecodename = 'Aktiv'
and fp.new_startdatum >= @von +'00:00:00'
and fp.new_enddatum <= @bis +'23:59:59'
--and new_projekt= Filterednew_projekt.new_
--group by new_projekt
Look at the column new_projekt. Rows 2 and 3 have the same project, but I want it to appear only once. Because the other columns differ, this is not possible.
If it is of interest, there is another column, projectcondition id, which is unique for both.
You can't ask a database to decide arbitrarily for you which records should be thrown away when doing a group. You have to be precise and specific.
Example, here is some data about a person:
Name, AddressZipCode
John Doe, 90210
John Doe, 12345
SELECT name, addresszipcode FROM person INNER JOIN address on address.personid = person.id
There are two addresses stored for this one guy, the person data is repeated in the output!
"I don't want that. I only want to see one line for this guy, together with his address"
Which address?
That's what you have to tell the database
"Well, obviously his current address"
And how do you denote that an address is current?
"It's the one with the null enddate"
SELECT name, addresszipcode FROM person INNER JOIN address on address.personid = person.id WHERE address.enddate IS NULL
If you still get two addresses out, there are two address records whose end date is null. You have data that is in violation of your business data modelling principles ("a person's address history shall have at most one address that is current, denoted by a null end date"), so fix the data.
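A null end date has to be tested with IS NULL, because a comparison written with = against NULL is never true. A minimal sketch in Python's sqlite3, with a made-up address table (which zip code is current, and the end date itself, are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (personid INTEGER, addresszipcode TEXT, enddate TEXT);
-- Hypothetical rows: one current address (NULL end date), one past address.
INSERT INTO address VALUES (1, '90210', NULL);
INSERT INTO address VALUES (1, '12345', '2015-06-30');
""")

# "= NULL" evaluates to NULL (unknown) for every row, so nothing is returned.
with_equals = conn.execute(
    "SELECT addresszipcode FROM address WHERE enddate = NULL"
).fetchall()

# "IS NULL" is the test that actually finds the current address.
with_is_null = conn.execute(
    "SELECT addresszipcode FROM address WHERE enddate IS NULL"
).fetchall()

print(with_equals, with_is_null)
```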
"Why can't i just group by name?"
You can, but if you do, you still have to tell the database how to aggregate the non-name data that it shows you. You want one address out of it, it has two it could show you, so you have to tell it which to discard. You could do this:
SELECT name, MAX(addresszipcode) FROM person INNER JOIN address on address.personid = person.id GROUP BY name
"But I don't want the max zipcode? That doesn't make sense"
OK, use the MIN, the SUM, the AVG, anything that makes sense. If none of these make sense, then use something that does, like the address line that has the highest end date, or the lowest end date that is a future end date. If you only want one address shown, you must decide how to boil that data down to just one record. You have to write the rule for the database to follow, so make it a rule that describes what you actually want.
OK, so you created a rule: you want only the rows with the minimum new_stundensatz.
DECLARE @Year varchar(75) = '2018'
DECLARE @von DateTime = '1.09.2018'
DECLARE @bis DateTime = '30.09.2018'
select new_projekt ,new_geschftsartname, new_mitarbeitername, new_stundensatz
from
(SELECT *, ROW_NUMBER() OVER(PARTITION BY new_projekt ORDER BY new_stundensatz) rown FROM Filterednew_projektkondition) ps
left join
Filterednew_fakturierungsplan fp on ps.new_projekt = fp.new_hauptprojekt1
where ps.statecodename = 'Aktiv'
and fp.new_startdatum >= @von +'00:00:00'
and fp.new_enddatum <= @bis +'23:59:59'
and ps.rown = 1
Here I've used an analytic operation to number the rows in your PS table. They're numbered in order of ascending new_stundensatz, starting with 1. The numbering restarts when the new_projekt changes, so each new_projekt will have a number 1 row, and then we make that a condition of the WHERE.
(Helpful side note for applying this technique in future: if it were the FP table we were adding a row number to, we would need to put AND fp.rown = 1 in the ON clause, not the WHERE clause, because putting it in the WHERE would make the LEFT JOIN behave like an INNER JOIN, hiding rows that don't have any matching FP record.)
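The row-numbering idea can be tried out on a small scale. This is only a sketch in Python's sqlite3 (assuming the bundled SQLite is 3.25 or newer, which is when window functions were added); the conditions table and its rows are hypothetical stand-ins for Filterednew_projektkondition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the project conditions table.
CREATE TABLE conditions (project TEXT, employee TEXT, rate REAL);
INSERT INTO conditions VALUES
  ('P1', 'Anna', 120.0),
  ('P1', 'Ben',   95.0),
  ('P2', 'Cara', 110.0);
""")

# Number the rows per project by ascending rate, then keep only row 1,
# i.e. the row with the lowest rate in each project.
rows = conn.execute("""
SELECT project, employee, rate
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY project ORDER BY rate) AS rown
  FROM conditions
) AS ps
WHERE ps.rown = 1
ORDER BY project
""").fetchall()

print(rows)
```

Project P1 has two rows, but only the one with the lower rate survives the rown = 1 filter, so each project appears once.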

Selecting rows from Parent Table only if multiple rows in Child Table match

I'm building a program that learns tic-tac-toe by saving information in a database.
I have two tables, Games(ID,Winner) and Turns(ID,Turn,GameID,Place,Shape).
I want to find a parent row by conditions on several of its child rows.
For Example:
SELECT GameID FROM Turns WHERE
GameID IN (WHEN Turn = 1 THEN Place = 1) AND GameID IN (WHEN Turn = 2 THEN Place = 4);
Is something like this possible?
I'm using MS Access.
Turn - Game turn
GameID - Game ID
Place - Place on matrix (1=top right, 9=bottom left)
Shape - X or circle
Thanks in advance
This very simple query will do the trick in a single scan, and doesn't require you to violate First Normal Form by storing multiple values in a string (shudder).
SELECT T.GameID
FROM Turns AS T
WHERE
(T.Turn = 1 AND T.Place = 1)
OR (T.Turn = 2 AND T.Place = 4)
GROUP BY T.GameID
HAVING Count(*) = 2;
There is no need to join to determine this information, as is suggested by other answers.
Please use proper database design principles in your database, and don't violate First Normal Form by storing multiple values together in a single string!
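As a rough check of the GROUP BY / HAVING approach, here is a sketch in Python's sqlite3 rather than Access. The three games below are invented test data; only game 1 has both turn 1 on place 1 and turn 2 on place 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Turns (ID INTEGER PRIMARY KEY, Turn INTEGER, GameID INTEGER,
                    Place INTEGER, Shape TEXT);
-- Hypothetical games: only game 1 satisfies both conditions.
INSERT INTO Turns (Turn, GameID, Place, Shape) VALUES
  (1, 1, 1, 'X'), (2, 1, 4, 'O'),
  (1, 2, 1, 'X'), (2, 2, 5, 'O'),
  (1, 3, 3, 'X'), (2, 3, 4, 'O');
""")

# Keep rows matching either condition, then require that a game
# contributed two such rows (one per condition).
matching = [row[0] for row in conn.execute("""
SELECT T.GameID
FROM Turns AS T
WHERE (T.Turn = 1 AND T.Place = 1)
   OR (T.Turn = 2 AND T.Place = 4)
GROUP BY T.GameID
HAVING Count(*) = 2
""")]

print(matching)
```

The count test relies on each game having at most one row per turn number; if a turn could be stored twice, the count could reach 2 without both conditions being met.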
The general solution to your problem can be accomplished by using a sub-query that contains a self-join between two instances of the Turns table:
SELECT * FROM Games
WHERE ID IN
(
SELECT Turns1.GameID
FROM Turns AS Turns1
INNER JOIN Turns AS Turns2
ON Turns1.GameID = Turns2.GameID
WHERE (
(Turns1.Turn=1 AND Turns1.Place = 1)
AND
(Turns2.Turn=2 AND Turns2.Place = 4))
);
The Self Join between Turns (aliased Turns1 and Turns2) is key, because if you just try to apply both sets of conditions at once like this:
WHERE (
(Turns.Turn=1 AND Turns.Place = 1)
AND
(Turns.Turn=2 AND Turns.Place = 4))
you will never get any rows back. This is because in your table there is no way for an individual row to satisfy both conditions at the same time.
My experience using Access is that to do a complex query like this you have to use the SQL View and type the query in on your own, rather than use the Query Designer. It may be possible to do in the Designer, but it's always been far easier for me to write the code myself.
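The difference between the self-join and the single-table condition can be seen on a tiny data set. This sketch uses Python's sqlite3 instead of Access, and the two games are made-up test rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Turns (ID INTEGER PRIMARY KEY, Turn INTEGER, GameID INTEGER,
                    Place INTEGER, Shape TEXT);
-- Hypothetical games: game 1 matches both conditions, game 2 only the first.
INSERT INTO Turns (Turn, GameID, Place, Shape) VALUES
  (1, 1, 1, 'X'), (2, 1, 4, 'O'),
  (1, 2, 1, 'X'), (2, 2, 5, 'O');
""")

# Both conditions on the same row: Turn cannot be 1 and 2 at once.
naive = conn.execute("""
SELECT GameID FROM Turns
WHERE (Turn = 1 AND Place = 1) AND (Turn = 2 AND Place = 4)
""").fetchall()

# Self-join: each alias supplies one of the two rows for the same game.
joined = conn.execute("""
SELECT DISTINCT Turns1.GameID
FROM Turns AS Turns1
INNER JOIN Turns AS Turns2 ON Turns1.GameID = Turns2.GameID
WHERE (Turns1.Turn = 1 AND Turns1.Place = 1)
  AND (Turns2.Turn = 2 AND Turns2.Place = 4)
""").fetchall()

print(naive, joined)
```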
select ID from Games g where exists (select * from Turns t where
t.GameID = g.ID and ((turn =1 and place = 1) or (turn =2 and place =5)))
This will select all the games that have at least one turn matching the corresponding criteria.
More info on EXISTS:
http://www.techonthenet.com/sql/exists.php
I bypassed this problem by adding a column which holds the turns as a string, for example "154728", and I search for that instead. I think this solution is also less demanding on the database.

How to join two queries in access

I am unable to write a query that returns results as follows:
TABLES: series, usuarios, siguiendo, valoraciones_personales
Each table has these records:
example: field1(value), field2(value),...
series (I mean TV shows; I am Spanish and here we say "serie" for "TV show")
1. id_serie(1),id_titulo('Sons of anarchy')
2. id_serie(2),id_titulo('Lost')
usuarios (user)
1. id_usuario(1), nick('david')
siguiendo (a user follows a series)
1. id_serie(1),id_usuario(1)
2. id_serie(2),id_usuario(1)
valoraciones_personales (personal assessments)
1. id_serie(1),id_usuario(1),nota(8)
OK, what I want is a result with all records of the table siguiendo. If the user rated one of those series, it must show the score (nota in Spanish), and if the user didn't score that series, I want to show "without score".
The view I want:
*titulo, nota*
- Sons of anarchy, 8
- Lost, without score
Can anyone help me?
Specifically in MS Access.
Create a query called something like AllUserSeries:
SELECT
U.UserID
,U.FullName
,S.SeriesID
,S.SeriesName
FROM
usuarios as U
,series as S
This is the equivalent of a cross join.
Then another:
SELECT
A.FullName
,A.SeriesName
,Nz(Cstr(R.Score),"Not Rated") as Rating
FROM
AllUserSeries AS A
LEFT OUTER JOIN valoraciones_personales AS R
ON A.UserID = R.UserID
AND A.SeriesID = R.SeriesID
WHERE
A.UserID = @UserID
The tricky bit is getting a list of all the series a user may have liked. To do this I would normally do a cross join to get all permutations that could exist, then left join from there to the ratings table, using Nz to handle null values as you see fit.
*Sorry for making up the other column names; it was easier for me to use English. Hope that's okay :D
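The cross join followed by a left join can be sketched against the sample data from the question. This uses Python's sqlite3, not Access, so COALESCE with CAST stands in for Nz with CStr (an assumption about how to translate the expression, not Access syntax); the table and column names are the ones given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE series (id_serie INTEGER, id_titulo TEXT);
CREATE TABLE usuarios (id_usuario INTEGER, nick TEXT);
CREATE TABLE valoraciones_personales (id_serie INTEGER, id_usuario INTEGER,
                                      nota INTEGER);
INSERT INTO series VALUES (1, 'Sons of anarchy'), (2, 'Lost');
INSERT INTO usuarios VALUES (1, 'david');
INSERT INTO valoraciones_personales VALUES (1, 1, 8);
""")

# Pair every user with every series, then attach the rating if one exists.
# COALESCE supplies the default text when the left join finds no rating.
rows = conn.execute("""
SELECT s.id_titulo AS titulo,
       COALESCE(CAST(r.nota AS TEXT), 'without score') AS nota
FROM usuarios AS u
CROSS JOIN series AS s
LEFT JOIN valoraciones_personales AS r
       ON r.id_usuario = u.id_usuario AND r.id_serie = s.id_serie
WHERE u.id_usuario = ?
ORDER BY s.id_serie
""", (1,)).fetchall()

print(rows)
```

With more than one user, the cross join lists every series for every user whether or not they follow it, so starting from siguiendo instead would be closer to what the question describes.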

how to get ID of field based on the Max of another field in same table

I have a table named ae_types. It contains three fields that are relevant to my question:
aetId This is Auto Increment and is the Primary Key
aetProposalType Text field 5 characters long
aetDaysToWait Byte data type
aetProposalType and aetDaysToWait are in a unique key so that I am guaranteed that there will never be two aetProposalTypes with the same aetDaysToWait.
The result that I am looking for is the aetId of the row with the largest aetDaysToWait for each aetProposalType.
Below is the query that I have come up with to accomplish this, but it seems to me like it is possibly unnecessarily complicated and not very beautiful.
SELECT ae_types.aetId AS lastEmailId, ae_types.aetProposalType
FROM ae_types INNER JOIN
(SELECT ae_types.aetProposalType, Max(ae_types.aetDaysToWait) AS MaxOfaetDaysToWait
FROM ae_types GROUP BY ae_types.aetProposalType) AS ae_maxDaysToWaitByProposalType
ON (ae_types.aetDaysToWait = ae_maxDaysToWaitByProposalType.MaxOfaetDaysToWait)
AND (ae_types.aetProposalType = ae_maxDaysToWaitByProposalType.aetProposalType);
What are some alternative solutions and why would they be better?
PS If you have any questions please ask and I will be happy to attempt to provide the answer.
That's the way I'd do it too.
select a.aetId, a.aetProposalType, a.aetDaysToWait
from ae_types a
inner join (select aetProposalType, max(aetDaysToWait) as MaxDays
from ae_types
group by aetProposalType) sq
on a.aetProposalType = sq.aetProposalType
and a.aetDaysToWait = sq.MaxDays
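For a quick sanity check of the join-to-the-maximum pattern, here is a sketch in Python's sqlite3. The proposal types 'A' and 'B' and their day counts are invented sample rows; the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ae_types (
  aetId INTEGER PRIMARY KEY AUTOINCREMENT,
  aetProposalType TEXT,
  aetDaysToWait INTEGER,
  UNIQUE (aetProposalType, aetDaysToWait)
);
-- Hypothetical rows; aetId values 1..4 are assigned in insertion order.
INSERT INTO ae_types (aetProposalType, aetDaysToWait) VALUES
  ('A', 3), ('A', 10), ('B', 7), ('B', 2);
""")

# The subquery finds the largest wait per type; joining back on both
# columns recovers the aetId of that row.
rows = conn.execute("""
SELECT a.aetId, a.aetProposalType, a.aetDaysToWait
FROM ae_types a
INNER JOIN (SELECT aetProposalType, MAX(aetDaysToWait) AS MaxDays
            FROM ae_types
            GROUP BY aetProposalType) sq
  ON a.aetProposalType = sq.aetProposalType
 AND a.aetDaysToWait = sq.MaxDays
ORDER BY a.aetProposalType
""").fetchall()

print(rows)
```

The unique key on (aetProposalType, aetDaysToWait) is what guarantees exactly one row per type comes back from the join.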

How to write a query returning non-chosen records

I have written a psychological testing application, in which the user is presented with a list of words, and s/he has to choose ten words which very much describe himself, then choose words which partially describe himself, and words which do not describe himself. The application itself works fine, but I was interested in exploring the meta-data possibilities: which words have been most frequently chosen in the first category, and which words have never been chosen in the first category. The first query was not a problem, but the second (which words have never been chosen) leaves me stumped.
The table structure is as follows:
table words: id, name
table choices: pid (person id), wid (word id), class (value between 1-6)
Presumably the answer involves a left join between words and choices, but there has to be a modifying statement - where choices.class = 1 - and this is causing me problems. Writing something like
select words.name
from words left join choices
on words.id = choices.wid
where choices.class = 1
and choices.pid = null
causes the database manager to go on a long trip to nowhere. I am using Delphi 7 and Firebird 1.5.
TIA,
No'am
Maybe this is a bit faster:
SELECT w.name
FROM words w
WHERE NOT EXISTS
(SELECT 1
FROM choices c
WHERE c.class = 1 and c.wid = w.id)
Something like that should do the trick:
SELECT name
FROM words
WHERE id NOT IN
(SELECT DISTINCT wid -- DISTINCT is actually redundant
FROM choices
WHERE class = 1)
SELECT words.name
FROM
words
LEFT JOIN choices ON words.id = choices.wid AND choices.class = 1
WHERE choices.pid IS NULL
Make sure you have an index on choices (class, wid).
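The three variants can be compared side by side on a small data set. This is a sketch in Python's sqlite3 rather than Firebird 1.5, and the words, the choices and the index name are made up for the test. It also shows what happens when the class filter sits in the WHERE clause instead of the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE choices (pid INTEGER, wid INTEGER, class INTEGER);
-- Hypothetical data: only 'calm' was ever chosen in class 1.
INSERT INTO words VALUES (1, 'calm'), (2, 'bold'), (3, 'shy');
INSERT INTO choices VALUES (10, 1, 1), (10, 2, 3), (11, 2, 2);
CREATE INDEX idx_choices_class_wid ON choices (class, wid);
""")

def names(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

not_exists = names("""
SELECT w.name FROM words w
WHERE NOT EXISTS (SELECT 1 FROM choices c
                  WHERE c.class = 1 AND c.wid = w.id)
ORDER BY w.name""")

not_in = names("""
SELECT name FROM words
WHERE id NOT IN (SELECT wid FROM choices WHERE class = 1)
ORDER BY name""")

anti_join = names("""
SELECT words.name FROM words
LEFT JOIN choices ON words.id = choices.wid AND choices.class = 1
WHERE choices.pid IS NULL
ORDER BY words.name""")

# Filter moved to WHERE: unmatched rows have class NULL, so they are dropped.
filter_in_where = names("""
SELECT words.name FROM words
LEFT JOIN choices ON words.id = choices.wid
WHERE choices.class = 1 AND choices.pid IS NULL""")

print(not_exists, not_in, anti_join, filter_in_where)
```

The first three queries agree on this data, while the last one returns nothing, because a row cannot have class = 1 and a NULL pid at the same time. NOT IN behaves differently from the other two if wid can be NULL, which is not the case in this sample.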