sql is duplicating my results - sql

I believe the problem is within my joins but i am unable to correct it. The SQL should return 3 rows however it is duplicating and returning 12 rows instead. Any help would be much appreciated!
SELECT J.JOURNEY_NUMBER,
L.DESCRIPTION,
L.USE_CODE,
J.REAL_START_DATE,
J.REAL_END_DATE,
S.STOP_ID,
SN.WRIN_ID,
J.JOURNEY_ID
FROM PDA_STG.JOURNEY J,
PDA_STG.RESTAURANT R,
PDA_STG.LOCATION L,
PDA_STG.SERIAL_NUMBER SN,
PDA_STG.STOP S
WHERE J.JOURNEY_ID = R.JOURNEY_ID
AND l.loc_id = r.rest_loc_id
AND J.JOURNEY_ID = S.JOURNEY_ID
AND S.STOP_ID = SN.STOP_ID
AND SN.WRIN_ID = '00768669'
AND j.dc_loc_id = '994'
AND J.JOURNEY_ID = '357020'
AND J.PLANNED_START_DATE < '20-APR-17'
ORDER BY J.JOURNEY_ID DESC

You are probably joining records that you don't want to join for which you'd have to add some join criteria. (For instance if the serial number could change for a stop, i.e. you keep old serial numbers with a date, you'd only want the latest serial number, not all.)
In order to find the flaw in your query you can select * and see what records you are actually selecting.

Thanks for the feedback, i just done as India.Rocket said and it worked perfectly.
without sample it's difficult to tell what's wrong with the query. But
if rows are exact duplicate then just put a distinct after select.
That should do the job – India.Rocket 46 mins ago

Related

Query does execute without errors, but returns no output

My query here has a sub-query in it but it returns no output, but in reality it has to give some output because I manually checked and output exists.I have posted the query below.
select mac.mac_id,mac.mac1,mac.mac_type,record.soc_id
from mso_charter.mac
join record on mac.record_id = record.record_id
where mac.mac_type='ethB' and record.soc_id IN (select soc from d);
Sample data is below
mac_id mac1 mac_type record_id--- for table mac
1 6142 ethA 1
2 6412 ethB 1
3 2313 ethC 1
record_id soc_id ---- for table record
1 Qu132
1 as432
1 342aq
soc --- for table d
a12w2
23we
qw12
mso_charter is the schema name mac,d and record is the table name.
Note that your subquery is actually still a join and can be written that way:
select mac.mac_id,mac.mac1,mac.mac_type,record.soc_id
from mso_charter.mac
join record using(record_id)
join d on record.soc_id = d.soc
where mac.mac_type='ethB';
As per the comment we still need a data set to reproduce and help.
Should be select soc_id from d instead of select select soc from d
According to your sample data, d has a column soc_id. That should be used for the comparison:
select m.mac_id, m.mac1, m.mac_type, r.soc_id
from mso_charter.mac m join
record r
on m.record_id = r.record_id
where m.mac_type = 'ethB' and
r.soc_id in (select d.soc_id from d);
It is possible that ids look the same but are not, because of international characters, hidden characters, spaces in the wrong place, and so on.
If this doesn't work then try the following:
Remove the soc_id condition and see if any rows match the first condition and join.
If that still returns nothing, remove the entire where clause to see if anything matches the join.
None of your record.soc_id match any of your d.soc_id. So you get no row.
Also, you write select soc from d. soc, not soc_id. Typo or error?
So, thanks to all who tried helping me in this situation. I actually had did a very silly mistake.The reason I am posting the right answer is probably because if someone else in future get stuck in such a issue or something similar it would be helpful to them.
select m.mac_id,m.mac,m.mac_type,r.soc_id
from mso_charter.mac m
join mso_charter.record r on m.record_id = r.record_id
where m.mac_type = 'ethB' and r.soc_id IN (select d.soc_id from d);
Mistake was I had not mentioned the schema name while performing join and there were multiple tables named record in other schema's, it was just out of frustration we tend to forget small things which costed me few hours to work over.

how to join multiple tables without showing repeated data?

I pop into a problem recently, and Im sure its because of how I Join them.
this is my code:
select LP_Pending_Info.Service_Order,
LP_Pending_Info.Pending_Days,
LP_Pending_Info.Service_Type,
LP_Pending_Info.ASC_Code,
LP_Pending_Info.Model,
LP_Pending_Info.IN_OUT_WTY,
LP_Part_Codes.PartCode,
LP_PS_Codes.PS,
LP_Confirmation_Codes.SO_NO,
LP_Pending_Info.Engineer_Code
from LP_Pending_Info
join LP_Part_Codes
on LP_Pending_Info.Service_order = LP_Part_Codes.Service_order
join LP_PS_Codes
on LP_Pending_Info.Service_Order = LP_PS_Codes.Service_Order
join LP_Confirmation_Codes
on LP_Pending_Info.Service_Order = LP_Confirmation_Codes.Service_Order
order by LP_Pending_Info.Service_order, LP_Part_Codes.PartCode;
For every service order I have 5 part code maximum.
If the service order have only one value it show the result correctly but when it have more than one Part code the problem begin.
for example: this service order"4182134076" has only 2 part code, first'GH81-13601A' and second 'GH96-09938A' so it should show the data 2 time but it repeat it for 8 time. what seems to be the problem?
If your records were exactly the same the distinct keyword would have solved it.
However in rows 2 and 3 which have the same Service_Order and Part_Code if you check the SO_NO you see it is different - that is why distinct won't work here - the rows are not identical.
I say you have some problem in one of the conditions in your joins. The different data is in the SO_NO column so check the raw data in the LP_Confirmation_Codes table for that Service_Order:
select * from LP_Confirmation_Codes where Service_Order = 4182134076
I assume you are missing an and with the value from the LP_Part_Codes or LP_PS_Codes (but can't be sure without seeing those tables and data myself).
By this sentence If the service order have only one value it show the result correctly but when it have more than one Part code the problem begin. - probably you are missing and and with the LP_Part_Codes table
Based on your output result, here are the following data that caused multiple output.
Service Order: 4182134076 has :
2 PartCode which are GH81-13601A and GH96-09938A
2 PS which are U and P
2 SO_NO which are 1.00024e+09 and 1.00022e+09
Therefore 2^3 returns 8 rows. I believe that you need to check where you should join your tables.
Use DINTINCT
select distinct LP_Pending_Info.Service_Order,LP_Pending_Info.Pending_Days,
LP_Pending_Info.Service_Type,LP_Pending_Info.ASC_Code,LP_Pending_Info.Model,
LP_Pending_Info.IN_OUT_WTY, LP_Part_Codes.PartCode,LP_PS_Codes.PS,
LP_Confirmation_Codes.SO_NO,LP_Pending_Info.Engineer_Code
from LP_Pending_Info
join LP_Part_Codes on LP_Pending_Info.Service_order = LP_Part_Codes.Service_order
join LP_PS_Codes on LP_Part_Codes.Service_Order = LP_PS_Codes.Service_Order
join LP_Confirmation_Codes on LP_PS_Codes.Service_Order = LP_Confirmation_Codes.Service_Order
order by LP_Pending_Info.Service_order, LP_Part_Codes.PartCode;
distinct will not return duplicates based on your select. So if a row is same, it will only return once.

SQL LEFT JOIN WHERE not displaying right result

So I got this query:
Data structure:
Users
id---inlog----name----more stuff
llntoets
id---code----inlog----more stuff
oefeningen
id---speler---status----morestuff
(inlog and speler are always the same values for a user)
SELECT
// Some other stuff working
SUM(o.status) AS oefn
FROM users AS u
LEFT JOIN llntoets AS l
ON (u.inlog = l.inlog)
LEFT JOIN oefeningen AS o
ON (u.inlog = o.speler) AND o.status = 'afgewerkt'
WHERE
code = '$code'
GROUP BY l.inlog
ORDER BY klas ASC, klasnr ASC
Everything runs fine except 1 thing the oefn variable. It shows a number sometimes it shows the correct value and sometimes it shows a value that is much higher than it should be. Someone told me it could be because of the GROUP BY. Can someone help me pls?
It is supposed to count the total records from table oefeningen where status = 'afgewerkt' and where the speler is the inlog from users. Thanks, if you got other questions ask will try to explain more.
the SUM(o.status) in your query it is not supposed to count the total records of table oefeningen.
that sum is the sum of the values of all the joined rows that satisfy your criteria that can be a much higher number.
also note that applying the filter o.status = 'afgewerkt' you are performing a JOIN even if you wrote LEFT JOIN throghout the query.

How to do multiple join / group by selects using sqlite3?

I have a sqlite3 database with one table called orig:
CREATE TABLE orig (sdate date, stime integer, orbnum integer);
What I want to do is select the first date/time for each orbnum. The only problem is that stime holds the time as a very awkward integer.
Assuming a six-digit number, the first two digits show the hour, the 3./4. show the minutes, and the last two digits show the seconds. So a value of 12345 is 1:23:45, whereas a value of 123456 is 12:34:56.
I figured I'd do this using two nested join/group statements, but somehow I cannot get it to work properly. Here's what I've got so far:
select s.orbnum, s.sdate, s.stime
from (
select t.orbnum, t.sdate, t.stime, min(t.sdate) as minsdate
from (
select orbnum, sdate, stime, min(stime) as minstime
from scia group by orbnum, sdate
) as t inner join orig as s on s.stime = t.minstime and s.sdate = t.sdate and s.orbnum = t.orbnum
) as d inner join scia as s on s.stime = d.stime and s.sdate = minsdate and s.orbnum = d.orbnum
where s.sdate >= '2002-08-01' limit 0,200;
This is the error I get:
Error: no such column: t.orbnum
I'm sure it's just some stupid mistake, but actually, I'm quite new to SQL ...
Any help is greatly appreciated :)
Edit:
After fixing the obvious typo, the query runs -- but returns an empty result set. However, the table holds ~10yrs of data, with about 12 orbnums per day and about 4-5 different times per orbnum. So I guess there's some mistake in the logic of the query ...
In your last join, you have d, which is the result of your double nested select, and you join s on it. From there, t is not visible. That’s why you get the “no such column: t.orbnum” error. Maybe you meant s.orbnum = d.orbnum?

MySQL to return only last date / time record

We have a database that stores vehicle's gps position, date, time, vehicle identification, lat, long, speed, etc., every minute.
The following select pulls each vehicle position and info, but the problem is that returns the first record, and I need the last record (current position), based on date (datagps.Fecha) and time (datagps.Hora). This is the select:
SELECT configgps.Fichagps,
datacar.Ficha,
groups.Nombre,
datagps.Hora,
datagps.Fecha,
datagps.Velocidad,
datagps.Status,
datagps.Calleune,
datagps.Calletowo,
datagps.Temp,
datagps.Longitud,
datagps.Latitud,
datagps.Evento,
datagps.Direccion,
datagps.Provincia
FROM asigvehiculos
INNER JOIN datacar ON (asigvehiculos.Iddatacar = datacar.Id)
INNER JOIN configgps ON (datacar.Configgps = configgps.Id)
INNER JOIN clientdata ON (asigvehiculos.Idgroup = clientdata.group)
INNER JOIN groups ON (clientdata.group = groups.Id)
INNER JOIN datagps ON (configgps.Fichagps = datagps.Fichagps)
Group by Fichagps;
I need same result I'm getting, but instead of the older record I need the most recent
(LAST datagps.Fecha / datagps.Hora).
How can I accomplish this?
Add ORDER BY datagps.Fecha DESC, datagps.Hora DESC LIMIT 1 to your query.
I'm not sure why you are having any problems with this as Lex's answers seem good.
I would start putting ORDER BY's in your query so it puts them in an order, when it's showing the record you want as the first one in the list, then add the LIMIT.
If you want the most recent, then the following should be good enough:
ORDER BY datagps.Fecha DESC, datagps.Hora DESC
If you simply want the record that was added to the database most recently (irregardless of the date/time fields), then you could (assuming you have an auto-incremental primary key in the datagps table (I assume it's called dataID for this example)):
ORDER BY datagps.dataID DESC
If these aren't showing the data you want - then there is something missing from your example (maybe data-types aren't DATETIME fields? - if not - then maybe a CONVERT to change them from their current type before ORDERing BY would be a good idea)
EDIT:
I've seen the screenshot and I'm confused as to what the issue is still. That appears to be showing everything in order. Are you implying that there are many more than 5 records? How many are you expecting?
Do you mean: for each record returned, you want the one row from the table datagps with the latest date and time attached to the result? If so, how about this:
# To show how the query will be executed
# comment to return actual results
EXPLAIN
SELECT
configgps.Fichagps, datacar.Ficha, groups.Nombre, datagps.Hora, datagps.Fecha,
datagps.Velocidad, datagps.Status, datagps.Calleune, datagps.Calletowo,
datagps.Temp, datagps.Longitud, datagps.Latitud, datagps.Evento,
datagps.Direccion, datagps.Provincia
FROM asigvehiculos
INNER JOIN datacar ON (asigvehiculos.Iddatacar = datacar.Id)
INNER JOIN configgps ON (datacar.Configgps = configgps.Id)
INNER JOIN clientdata ON (asigvehiculos.Idgroup = clientdata.group)
INNER JOIN groups ON (clientdata.group = groups.Id)
INNER JOIN datagps ON (configgps.Fichagps = datagps.Fichagps)
########### Add this section
LEFT JOIN datagps b ON (
configgps.Fichagps = b.Fichagps
# wrong condition
#AND datagps.Hora < b.Hora
#AND datagps.Fecha < b.Fecha)
# might prevent indexes to be used
AND (datagps.Fecha < b.Fecha OR (datagps.Fecha = b.Fecha AND datagps.Hora < b.Hora))
WHERE b.Fichagps IS NULL
###########
Group by configgps.Fichagps;
Similar question here only that that one uses outer joins.
Edit (again):
The conditions are wrong so corrected it. Can you show us the output of the above EXPLAIN query so we can pinpoint where the bottle neck is?
As hurikhan77 said, it will be better if you could convert both of the the columns into a single datetime field - though I'm guessing this would not be possible for your case (since your database is already being used?)
Though if you can convert it, the condition (on the join) would become:
AND datagps.FechaHora < b.FechaHora
After that, add an index for datagps.FechaHora and the query would be fast(er).
What you probably want is getting the maximum of (Fecha,Hora) per grouped dataset? This is a little complicated to accomplish with your column types. You should combine Fecha and Hora into one column of type DATETIME. Then it's easy to just SELECT MAX(FechaHora) ... GROUP BY Fichagps.
It could have helped if you posted your table structure to understand the problem.