SQL LEFT JOIN WHERE not displaying right result - sql

Here is my data structure, followed by the query:
Users
id---inlog----name----more stuff
llntoets
id---code----inlog----more stuff
oefeningen
id---speler---status----morestuff
(inlog and speler are always the same values for a user)
SELECT
-- (other columns omitted; those work)
SUM(o.status) AS oefn
FROM users AS u
LEFT JOIN llntoets AS l
ON (u.inlog = l.inlog)
LEFT JOIN oefeningen AS o
ON (u.inlog = o.speler) AND o.status = 'afgewerkt'
WHERE
code = '$code'
GROUP BY l.inlog
ORDER BY klas ASC, klasnr ASC
Everything runs fine except for one thing: the oefn value. Sometimes it shows the correct number, and sometimes it shows a value that is much higher than it should be. Someone told me it could be because of the GROUP BY. Can someone help me, please?
It is supposed to count the records in table oefeningen where status = 'afgewerkt' and where speler is the inlog from users. Thanks; if you have other questions, ask and I will try to explain more.

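The SUM(o.status) in your query does not count the records of table oefeningen.
It adds up the status values of all the joined rows that satisfy your criteria, and because every llntoets row is combined with every matching oefeningen row, that can be a much higher number. To count rows, use COUNT instead.
Also note that the filter code = '$code' in the WHERE clause discards the NULL-extended rows, so the join to llntoets behaves like an inner JOIN even though you wrote LEFT JOIN throughout the query. The condition o.status = 'afgewerkt' is in the ON clause and does not have that effect.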
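To make the inflated number concrete, here is a minimal sketch in Python with an in-memory SQLite database. The sample rows are invented, and it assumes llntoets can hold more than one row per user for the same code, which the question does not confirm. Under that assumption each finished exercise is joined once per llntoets row, so a plain COUNT over-counts while COUNT(DISTINCT o.id) does not.

```python
import sqlite3

# Hypothetical miniature of the three tables; only the columns used here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE users      (id INTEGER PRIMARY KEY, inlog TEXT);
CREATE TABLE llntoets   (id INTEGER PRIMARY KEY, code TEXT, inlog TEXT);
CREATE TABLE oefeningen (id INTEGER PRIMARY KEY, speler TEXT, status TEXT);
INSERT INTO users      VALUES (1, 'anna');
-- assumption: anna has TWO llntoets rows for the same code
INSERT INTO llntoets   VALUES (1, 'T1', 'anna'), (2, 'T1', 'anna');
-- three exercises, two of them finished ('afgewerkt')
INSERT INTO oefeningen VALUES (1, 'anna', 'afgewerkt'),
                              (2, 'anna', 'afgewerkt'),
                              (3, 'anna', 'bezig');
""")

query = """
SELECT COUNT(o.id)          AS inflated,  -- counts every joined row
       COUNT(DISTINCT o.id) AS oefn       -- counts each exercise once
FROM users AS u
LEFT JOIN llntoets AS l ON u.inlog = l.inlog
LEFT JOIN oefeningen AS o ON u.inlog = o.speler AND o.status = 'afgewerkt'
WHERE l.code = 'T1'
GROUP BY l.inlog
"""
inflated, oefn = con.execute(query).fetchone()
print(inflated, oefn)
```

If llntoets really is one row per user and code, both columns agree and the over-count has to come from somewhere else.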

Related

COUNT is outputting more than one row

I am having a problem with my SQL query using the count function.
When I don't have an inner join, it counts 55 rows. When I add the inner join into my query, it adds a lot to it. It suddenly became 102 rows.
Here is my SQL Query:
SELECT COUNT([fmsStage].[dbo].[File].[FILENUMBER])
FROM [fmsStage].[dbo].[File]
INNER JOIN [fmsStage].[dbo].[Container]
ON [fmsStage].[dbo].[File].[FILENUMBER] = [fmsStage].[dbo].[Container].[FILENUMBER]
WHERE [fmsStage].[dbo].[File].[RELATIONCODE] = 'SHIP02'
AND [fmsStage].[dbo].[Container].DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'
GROUP BY [fmsStage].[dbo].[File].[FILENUMBER]
Also, I have to do TOP 1 at the SELECT statement because it returns 51 rows with random numbers inside of them. (They are probably not random, but I can't figure out what they are.)
What do I have to do to make it just count the rows from [fmsStage].[dbo].[file].[FILENUMBER]?
First, your query would be much clearer like this:
SELECT COUNT(f.[FILENUMBER])
FROM [fmsStage].[dbo].[File] f INNER JOIN
[fmsStage].[dbo].[Container] c
ON f.[FILENUMBER] = c.[FILENUMBER]
WHERE f.[RELATIONCODE] = 'SHIP02' AND
c.DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08';
No GROUP BY is necessary. With it, you'll just get one row per file number, which doesn't seem as useful as the overall count.
Note: You might want COUNT(DISTINCT f.[FILENUMBER]). Your question doesn't provide enough information to make a judgement.
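The difference between the three variants can be sketched with SQLite from Python. The tables files and containers are simplified stand-ins for [File] and [Container], and the rows are made up: one file has two containers in the date range, the other has one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE files      (FILENUMBER INTEGER, RELATIONCODE TEXT);
CREATE TABLE containers (FILENUMBER INTEGER, DELIVERYDATE TEXT);
INSERT INTO files VALUES (1, 'SHIP02'), (2, 'SHIP02');
-- file 1 has two containers in the range, file 2 has one
INSERT INTO containers VALUES (1, '2016-10-06'), (1, '2016-10-07'),
                              (2, '2016-10-08');
""")

# Shared FROM/WHERE part of all three queries.
base = """
FROM files f INNER JOIN containers c ON f.FILENUMBER = c.FILENUMBER
WHERE f.RELATIONCODE = 'SHIP02'
  AND c.DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'
"""

# One row per matching file/container pair.
joined_rows = con.execute("SELECT COUNT(f.FILENUMBER)" + base).fetchone()[0]
# One per file, no matter how many containers it has.
distinct_files = con.execute(
    "SELECT COUNT(DISTINCT f.FILENUMBER)" + base).fetchone()[0]
# With GROUP BY: one result row per file, each holding its container count.
per_file = con.execute(
    "SELECT COUNT(f.FILENUMBER)" + base +
    "GROUP BY f.FILENUMBER ORDER BY f.FILENUMBER").fetchall()

print(joined_rows, distinct_files, per_file)
```

The per-file counts are probably the "random numbers" the question mentions: with GROUP BY, each output row is the number of matching containers for one file.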
Just remove GROUP BY Clause
SELECT COUNT([fmsStage].[dbo].[File].[FILENUMBER])
FROM [fmsStage].[dbo].[File]
INNER JOIN [fmsStage].[dbo].[Container]
ON [fmsStage].[dbo].[File].[FILENUMBER] = [fmsStage].[dbo].[Container].[FILENUMBER]
WHERE [fmsStage].[dbo].[File].[RELATIONCODE] = 'SHIP02'
AND [fmsStage].[dbo].[Container].DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'

left join excludes nulls and takes left value

SELECT p.ticket AS posted,
e.ticket AS settled,
Sum(e.amount)
FROM post AS p
LEFT JOIN settle AS e
ON p.ticket = e.ticket
WHERE p.date = '2016-05-10 00:00:00.000'
GROUP BY p.pticket,
e.eticket
ORDER BY posted
I understand that the grouping or the WHERE is the culprit, but I've tried so many variations. The rows of the two tables relate like this:
(Table1=Table2)
(total = (item + tax= total))
So the second table has 2 rows that I sum. I need the date filter because there is too much data, and I've tried "is null" on the dates and in other places but can't get this right. Instead of NULL, it shows the value of the left table as if they matched.
So I figured it out! I love this site and wanted to leave my findings for whoever runs into this; please correct me if anything is inaccurate. From what I understand from SQL Fundamentals, the unmatched rows that come back with NULLs are called outer rows, and to keep them I need a "Left Outer Join":
SELECT p.ticket AS posted,
e.ticket AS settled,
Sum(e.amount)
FROM post AS p
LEFT JOIN settle AS e
ON p.ticket = e.ticket
WHERE p.date = '2016-05-10 00:00:00.000'
GROUP BY p.pticket,
e.eticket
ORDER BY posted
A good rule that stood out for me, and that helps me check joins involving NULLs: the WHERE clause should only have conditions on the preserved table (the one whose rows are never NULL-extended), so that you keep the outer rows. The exception is IS NULL: you can apply it to a column of the non-preserved table and still get the rows with NULLs.
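Here is a small sketch of that rule, run against SQLite from Python. The data is invented: ticket 1 is settled in two rows, ticket 2 is not settled at all. The e.amount > 0 condition is only there to illustrate a WHERE filter on the non-preserved table; it is not from the original query.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE post   (ticket INTEGER, date TEXT);
CREATE TABLE settle (ticket INTEGER, amount REAL);
INSERT INTO post   VALUES (1, '2016-05-10'), (2, '2016-05-10');
-- ticket 1 is settled in two rows (item + tax); ticket 2 is unsettled
INSERT INTO settle VALUES (1, 10.0), (1, 2.5);
""")

select = """
SELECT p.ticket, e.ticket, SUM(e.amount)
FROM post AS p LEFT JOIN settle AS e ON p.ticket = e.ticket
WHERE p.date = '2016-05-10' {extra}
GROUP BY p.ticket, e.ticket
ORDER BY p.ticket
"""

# Condition only on the preserved table: the outer row for ticket 2 survives.
keep_outer = con.execute(select.format(extra="")).fetchall()
# Condition on the non-preserved table: NULL > 0 is not true, outer row is lost.
lose_outer = con.execute(select.format(extra="AND e.amount > 0")).fetchall()
# The IS NULL exception: keeps exactly the rows that had no match.
only_unsettled = con.execute(
    select.format(extra="AND e.ticket IS NULL")).fetchall()

print(keep_outer, lose_outer, only_unsettled)
```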

SQL: return 0 count in case no record is found

A very simple issue as it appears but somehow not working for me on Oracle 10gXE.
Based on my SQLFiddle, I have to show all staff names and count if present or 0 if no record found having status = 2
How can I achieve this in a single query, without looping on the application side?
SELECT S.NAME,ISTATUS.STATUS,COUNT(ISTATUS.Q_ID) as TOTAL
FROM STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
ON S.ID = ISTATUS.DONE_BY
AND ISTATUS.STATUS = 2 -- instead of WHERE
GROUP BY S.NAME,ISTATUS.STATUS
By filtering in the WHERE clause, you filter too late, and you remove STAFF rows that you do want to see. Moving the filter into the join condition means only QUESTION_STATUS rows get filtered out.
Note that STATUS is not really a useful column here, since you won't ever get any result other than 2 or NULL, so you could omit it:
SELECT S.NAME,COUNT(ISTATUS.Q_ID) as TOTAL
FROM STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
ON S.ID = ISTATUS.DONE_BY
AND ISTATUS.STATUS = 2
GROUP BY S.NAME
I corrected your sqlfiddle: http://sqlfiddle.com/#!4/90ba0/12
The rule of thumb is that the filters must appear in the ON condition of the table they depend on.
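As a rough check of this, the sketch below runs both placements of the filter against SQLite from Python (not Oracle, so treat it as an illustration only). The staff names and status rows are invented. It also shows why COUNT(ISTATUS.Q_ID) is used rather than COUNT(*): the former ignores the NULL-extended row and yields 0, the latter counts it as 1.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE STAFF (ID INTEGER, NAME TEXT);
CREATE TABLE QUESTION_STATUS (Q_ID INTEGER, DONE_BY INTEGER, STATUS INTEGER);
INSERT INTO STAFF VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
-- Ann: two rows with status 2; Bob: one row with status 1; Cy: nothing
INSERT INTO QUESTION_STATUS VALUES (10, 1, 2), (11, 1, 2), (12, 2, 1);
""")

# Filter in the ON clause: every staff member is kept.
filter_in_on = con.execute("""
SELECT S.NAME, COUNT(ISTATUS.Q_ID), COUNT(*)
FROM STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
  ON S.ID = ISTATUS.DONE_BY AND ISTATUS.STATUS = 2
GROUP BY S.NAME ORDER BY S.NAME
""").fetchall()

# Filter in the WHERE clause: staff without a status-2 row disappear.
filter_in_where = con.execute("""
SELECT S.NAME, COUNT(ISTATUS.Q_ID)
FROM STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS ON S.ID = ISTATUS.DONE_BY
WHERE ISTATUS.STATUS = 2
GROUP BY S.NAME ORDER BY S.NAME
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```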
Move the filter into the LEFT JOIN, and use COALESCE so your results display 0 instead of NULL, as you requested in your question:
select S.NAME,COALESCE(ISTATUS.STATUS,0),COUNT(ISTATUS.Q_ID) as TOTAL
from STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
ON S.ID = ISTATUS.DONE_BY
AND ISTATUS.STATUS =2
GROUP BY S.NAME,ISTATUS.STATUS

SQL counter difficulties

select maintitle,
firstprodyear,
COUNT(DISTINCT episode.episodeid) as TOTALEPISODES
from series
LEFT OUTER JOIN episode ON series.seriesid = episode.seriesid
LEFT OUTER JOIN filmitem ON filmitem.filmid = episode.episodeid
where firstprodyear =(select MIN(firstprodyear) from series)
group by maintitle, firstprodyear;
Two thirds of the query works: I do get the title of the series and the earliest year. But the episode counter doesn't seem to work properly. For some series I get 15 or 34, and for others 0.
I would appreciate some guidance on making the episode counter work as it should. What have I missed?
Try:
select maintitle,
min(firstprodyear) firstprodyear,
COUNT(DISTINCT episode.episodeid) as TOTALEPISODES
from series
LEFT OUTER JOIN episode ON series.seriesid = episode.seriesid
/*LEFT OUTER JOIN filmitem ON filmitem.filmid = episode.episodeid */
group by maintitle;
Note: the link to filmitem appears to be unnecessary with the data selected.
Your query returns the series whose firstprodyear equals the earliest firstprodyear in your database. You can think of the sub-select statement as returning a fixed number that is then used in the query.
For example, if the earliest series in your database is Days of Our Lives (firstprodyear = 1965), then you will get back the number of episodes in those series that also started in 1965.
You may also want to be more explicit about which table actually contains the firstprodyear field, though I'm assuming it's series.
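To see what the sub-select does, here is a small sketch using SQLite from Python. The titles, years and episode rows are made up, and firstprodyear is assumed to live in series. Two series share the earliest year, so both come back; the one without episodes gets 0 because COUNT ignores the NULL episodeid produced by the outer join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE series  (seriesid INTEGER, maintitle TEXT, firstprodyear INTEGER);
CREATE TABLE episode (episodeid INTEGER, seriesid INTEGER);
INSERT INTO series  VALUES (1, 'Alpha', 1965), (2, 'Beta', 1965),
                           (3, 'Gamma', 1980);
-- Alpha has two episodes, Beta none, Gamma one
INSERT INTO episode VALUES (100, 1), (101, 1), (102, 3);
""")

rows = con.execute("""
SELECT maintitle, firstprodyear,
       COUNT(DISTINCT episode.episodeid) AS TOTALEPISODES
FROM series
LEFT OUTER JOIN episode ON series.seriesid = episode.seriesid
-- the sub-select evaluates to the single value 1965 here
WHERE series.firstprodyear = (SELECT MIN(firstprodyear) FROM series)
GROUP BY maintitle, firstprodyear
ORDER BY maintitle
""").fetchall()

print(rows)
```

So a 0 in the result is not necessarily a bug: it may simply be a series from the earliest year that has no rows in episode.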

MySQL to return only last date / time record

We have a database that stores vehicle's gps position, date, time, vehicle identification, lat, long, speed, etc., every minute.
The following select pulls each vehicle position and info, but the problem is that returns the first record, and I need the last record (current position), based on date (datagps.Fecha) and time (datagps.Hora). This is the select:
SELECT configgps.Fichagps,
datacar.Ficha,
groups.Nombre,
datagps.Hora,
datagps.Fecha,
datagps.Velocidad,
datagps.Status,
datagps.Calleune,
datagps.Calletowo,
datagps.Temp,
datagps.Longitud,
datagps.Latitud,
datagps.Evento,
datagps.Direccion,
datagps.Provincia
FROM asigvehiculos
INNER JOIN datacar ON (asigvehiculos.Iddatacar = datacar.Id)
INNER JOIN configgps ON (datacar.Configgps = configgps.Id)
INNER JOIN clientdata ON (asigvehiculos.Idgroup = clientdata.group)
INNER JOIN groups ON (clientdata.group = groups.Id)
INNER JOIN datagps ON (configgps.Fichagps = datagps.Fichagps)
Group by Fichagps;
I need same result I'm getting, but instead of the older record I need the most recent
(LAST datagps.Fecha / datagps.Hora).
How can I accomplish this?
Add ORDER BY datagps.Fecha DESC, datagps.Hora DESC LIMIT 1 to your query.
I'm not sure why you are having any problems with this as Lex's answers seem good.
I would start putting ORDER BY's in your query so it puts them in an order, when it's showing the record you want as the first one in the list, then add the LIMIT.
If you want the most recent, then the following should be good enough:
ORDER BY datagps.Fecha DESC, datagps.Hora DESC
If you simply want the record that was added to the database most recently (regardless of the date/time fields), and assuming you have an auto-increment primary key in the datagps table (I'll call it dataID for this example), then you could use:
ORDER BY datagps.dataID DESC
If these aren't showing the data you want - then there is something missing from your example (maybe data-types aren't DATETIME fields? - if not - then maybe a CONVERT to change them from their current type before ORDERing BY would be a good idea)
EDIT:
I've seen the screenshot and I'm confused as to what the issue is still. That appears to be showing everything in order. Are you implying that there are many more than 5 records? How many are you expecting?
Do you mean: for each record returned, you want the one row from the table datagps with the latest date and time attached to the result? If so, how about this:
# EXPLAIN shows how the query will be executed;
# comment it out to return the actual results
EXPLAIN
SELECT
configgps.Fichagps, datacar.Ficha, groups.Nombre, datagps.Hora, datagps.Fecha,
datagps.Velocidad, datagps.Status, datagps.Calleune, datagps.Calletowo,
datagps.Temp, datagps.Longitud, datagps.Latitud, datagps.Evento,
datagps.Direccion, datagps.Provincia
FROM asigvehiculos
INNER JOIN datacar ON (asigvehiculos.Iddatacar = datacar.Id)
INNER JOIN configgps ON (datacar.Configgps = configgps.Id)
INNER JOIN clientdata ON (asigvehiculos.Idgroup = clientdata.group)
INNER JOIN groups ON (clientdata.group = groups.Id)
INNER JOIN datagps ON (configgps.Fichagps = datagps.Fichagps)
########### Add this section
LEFT JOIN datagps b ON (
configgps.Fichagps = b.Fichagps
# wrong condition
#AND datagps.Hora < b.Hora
#AND datagps.Fecha < b.Fecha)
# might prevent indexes from being used
AND (datagps.Fecha < b.Fecha OR (datagps.Fecha = b.Fecha AND datagps.Hora < b.Hora)))
WHERE b.Fichagps IS NULL
###########
Group by configgps.Fichagps;
There is a similar question here, except that one uses outer joins.
Edit (again):
The conditions were wrong, so I corrected them. Can you show us the output of the above EXPLAIN query so we can pinpoint where the bottleneck is?
As hurikhan77 said, it would be better if you could convert both columns into a single datetime field, though I'm guessing this is not possible in your case (since your database is already in use?).
Though if you can convert it, the condition (on the join) would become:
AND datagps.FechaHora < b.FechaHora
After that, add an index for datagps.FechaHora and the query would be fast(er).
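The self-join trick above can be tried in isolation. This is a minimal sketch in Python with SQLite (not MySQL), using a cut-down, hypothetical datagps table with only three columns and invented rows. It runs both the original "wrong" condition and the corrected one, with Fecha and Hora kept as separate columns.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE datagps (Fichagps TEXT, Fecha TEXT, Hora TEXT);
INSERT INTO datagps VALUES
  ('A', '2010-01-01', '23:00'),
  ('A', '2010-01-02', '08:00'),   -- latest for A: later date, earlier time
  ('B', '2010-01-02', '09:00'),
  ('B', '2010-01-02', '10:00');   -- latest for B: same date, later time
""")

# A row is the latest when no row b of the same vehicle is newer,
# i.e. the LEFT JOIN finds nothing and b.Fichagps comes back NULL.
template = """
SELECT a.Fichagps, a.Fecha, a.Hora
FROM datagps a
LEFT JOIN datagps b
  ON a.Fichagps = b.Fichagps AND ({newer})
WHERE b.Fichagps IS NULL
ORDER BY a.Fichagps, a.Fecha, a.Hora
"""

wrong = con.execute(template.format(
    newer="a.Fecha < b.Fecha AND a.Hora < b.Hora")).fetchall()
corrected = con.execute(template.format(
    newer="a.Fecha < b.Fecha OR (a.Fecha = b.Fecha AND a.Hora < b.Hora)"
)).fetchall()

print(len(wrong), corrected)
```

With this data the wrong condition never finds a "newer" row, so all four rows slip through, while the corrected one leaves exactly one row per vehicle.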
What you probably want is getting the maximum of (Fecha,Hora) per grouped dataset? This is a little complicated to accomplish with your column types. You should combine Fecha and Hora into one column of type DATETIME. Then it's easy to just SELECT MAX(FechaHora) ... GROUP BY Fichagps.
It could have helped if you posted your table structure to understand the problem.
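For what it's worth, the MAX(FechaHora) ... GROUP BY idea could look roughly like this. It is a sketch in Python with SQLite rather than MySQL, and it assumes Fecha and Hora have already been merged into a single FechaHora column. The table is reduced to three columns, the rows are invented, and the alias ultima is made up. The grouped maximum is joined back to datagps so the other columns of the latest row are available.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- hypothetical combined column, stored as ISO text so it sorts correctly
CREATE TABLE datagps (Fichagps TEXT, FechaHora TEXT, Velocidad INTEGER);
INSERT INTO datagps VALUES
  ('A', '2010-01-01 23:00:00', 40),
  ('A', '2010-01-02 08:00:00', 55),
  ('B', '2010-01-02 09:00:00', 0),
  ('B', '2010-01-02 10:00:00', 30);
""")

latest = con.execute("""
SELECT d.Fichagps, d.FechaHora, d.Velocidad
FROM datagps d
JOIN (SELECT Fichagps, MAX(FechaHora) AS ultima   -- newest timestamp per vehicle
      FROM datagps
      GROUP BY Fichagps) m
  ON d.Fichagps = m.Fichagps AND d.FechaHora = m.ultima
ORDER BY d.Fichagps
""").fetchall()

print(latest)
```

If two rows of one vehicle share the same timestamp, both come back, so a tie-breaker would be needed in that case.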