relationships produce duplicate records on query, DISTINCT does not work, any other solutions available? - sql

I am new to this portal. I have a very simple problem related to ANSI SQL. I am writing reports using BIRT and fetching the data from several tables. I understand how SQL joins work, but maybe not fully. I searched Google for hours and could not find a relevant answer.
My problem is that one of the relationships in the code produces a duplicate result (the same row is returned twice). I was so determined to solve it that I tried every type of join available. Some of this SQL had already been written. I shall post my code below. I know that one of the solutions to my problem is the 'DISTINCT' keyword. I have used it and it does not solve my problem.
Can anyone propose a solution?
Sample code:
SELECT DISTINCT
partmaster.partdesc,
partmaster.uom,
traders.name AS tradername,
worksorders.id AS worksorderno,
worksorders.partid,
worksorders.quantity,
worksorders.duedate,
worksorders.traderid,
worksorders.orderid,
routingoperations.partid,
routingoperations.methodid,
routingoperations.operationnumber,
routingoperations.workcentreid,
routingoperations.settime,
routingoperations.runtime,
routingoperations.perquantity,
routingoperations.description,
routingoperations.alternativeoperation,
routingoperations.alternativeoperationpreference,
machines.macdesc,
machines.msection,
allpartmaster.partnum,
allpartmaster.nbq,
allpartmaster.partdesc,
routingoperationtools.toolid,
tools.tooldesc,
CAST (emediadetails.data as VARCHAR(MAX)) AS cplandata
FROM worksorders
INNER JOIN partmaster ON worksorders.partid = partmaster.partnum
INNER JOIN traders traders ON worksorders.traderid = traders.id
INNER JOIN routingoperations routingoperations ON worksorders.partid = routingoperations.partid
AND worksorders.routingmethod = routingoperations.methodid
INNER JOIN allpartmaster allpartmaster ON routingoperations.partid = allpartmaster.partnum
LEFT OUTER JOIN machines machines ON routingoperations.workcentreid = machines.macid
LEFT OUTER JOIN routingoperationtools routingoperationtools ON routingoperationtools.partid = routingoperations.partid
AND routingoperationtools.routingmethod = routingoperations.methodid
AND routingoperationtools.operationnumber = routingoperations.operationnumber
LEFT OUTER JOIN tools tools ON tools.toolid = routingoperationtools.toolid
LEFT OUTER JOIN emediadetails ON emediadetails.keyvalue1 = worksorders.id
AND emediadetails.keyvalue2 = routingoperations.operationnumber
AND emediadetails.emediaid = 'worksorderoperation'
I do not have much test data, but I know that one row appears twice in the result of the query above, even though I used the DISTINCT keyword. I know that my problem is rather specific and not general, but a solution that someone proposes may help others with a similar problem.

I can't solve your problem for you without some test data, but I have some helpful hints.
In principle, you should be really careful with DISTINCT - it's a great way of hiding bugs in your query. Only use DISTINCT if you are confident that the underlying data contains legitimate duplicates. If your joins are wrong and you're getting a Cartesian product, you can remove the duplicates from the results with DISTINCT - but that doesn't stop the Cartesian product from being generated. You'll get very poor performance, and possibly incorrect data.
Secondly, I am pretty sure that DISTINCT works properly - you are almost certainly not getting exact duplicates, but it may be hard to spot the difference between two rows. Leading or trailing spaces in text columns, for instance, could be to blame.
Finally, to work through this problem, I'd recommend building the query up join by join, and seeing where you get the duplicate - that's the join that's to blame.
So, start with:
SELECT
traders.name AS tradername,
worksorders.id AS worksorderno,
worksorders.partid,
worksorders.quantity,
worksorders.duedate,
worksorders.traderid,
worksorders.orderid
FROM worksorders
INNER JOIN traders traders ON
worksorders.traderid = traders.id
and build up to the next join.
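The join-by-join approach can be sketched by counting rows after each step. This is only an illustration: it uses an in-memory SQLite database driven from Python, and the three tiny tables and their contents are hypothetical stand-ins loosely named after the tables in the question.

```python
import sqlite3

# Hypothetical miniature tables; names only loosely follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE worksorders (id INTEGER, partid TEXT);
    CREATE TABLE routingoperations (partid TEXT, operationnumber INTEGER);
    CREATE TABLE routingoperationtools (partid TEXT, operationnumber INTEGER, toolid TEXT);

    INSERT INTO worksorders VALUES (1, 'P1');
    INSERT INTO routingoperations VALUES ('P1', 10);
    -- Two tools for the same operation: this is what multiplies rows.
    INSERT INTO routingoperationtools VALUES ('P1', 10, 'T1'), ('P1', 10, 'T2');
""")


def row_count(sql):
    """Return the number of rows a query produces."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


# Add one join at a time and record the row count after each step.
base = "SELECT w.id FROM worksorders w"
step1 = base + " INNER JOIN routingoperations r ON w.partid = r.partid"
step2 = step1 + (
    " LEFT JOIN routingoperationtools t"
    " ON t.partid = r.partid AND t.operationnumber = r.operationnumber"
)

counts = [row_count(q) for q in (base, step1, step2)]
print(counts)  # the count jumps at the join that fans out
```

The step where the count grows is the join that matches more than one row per parent, and that is the relationship to inspect.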

Are you sure the results are exact duplicates? Make sure there isn't one column that actually has a different value.
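A quick way to check for a hidden difference is to print the length of the suspect text columns next to their values. The sketch below assumes an in-memory SQLite database and a made-up `tools` table with a leading space in one description. How trailing spaces are compared varies between database engines, so treat the exact behaviour as something to verify on your own system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tools (toolid TEXT, tooldesc TEXT);
    -- The second description carries a leading space (hypothetical data).
    INSERT INTO tools VALUES ('T1', 'Drill'), ('T1', ' Drill');
""")

# DISTINCT treats the two descriptions as different values.
raw = conn.execute("SELECT DISTINCT toolid, tooldesc FROM tools").fetchall()

# Trimming before comparing collapses them into one row.
trimmed = conn.execute(
    "SELECT DISTINCT toolid, TRIM(tooldesc) FROM tools"
).fetchall()

# Showing the length next to the value makes the hidden difference visible.
lengths = conn.execute(
    "SELECT tooldesc, LENGTH(tooldesc) FROM tools ORDER BY LENGTH(tooldesc)"
).fetchall()

print(len(raw), len(trimmed), lengths)
```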

Related

Problem with SQL FULL OUTER JOIN with WHERE date

This query returns all rows with P500 and S500, but the prj_end_dt condition doesn't work at all; it is simply ignored. Is there a way to make it work?
Thank you,
SELECT project_labour.cod_no,project.prj_no,project.prj_end_dt,project_labour.pla_hrs_budg
FROM project_labour
FULL OUTER JOIN project ON project_labour.prj_no=project.prj_no
WHERE project_labour.cod_no='S500' OR project_labour.cod_no='P500' AND prj_end_dt<'2019-01-01'
FULL JOIN is almost never necessary. I rarely use it, and I write lots of queries. Filtering is even more troublesome. I am guessing that you really want:
SELECT pl.cod_no, p.prj_no, p.prj_end_dt, pl.pla_hrs_budg
FROM project p LEFT JOIN
project_labour pl
ON pl.prj_no = p.prj_no AND pl.cod_no IN ('S500', 'P500')
WHERE p.prj_end_dt < '2019-01-01';
This will return all projects from prior to 2019. Any matching project_labour rows will be returned. If there are none, then those columns will be NULL. It is quite possible that an INNER JOIN is sufficient for the query; the IN condition fixes the problem with your logic.
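The logic problem is operator precedence: AND binds more tightly than OR, so the original WHERE clause is read as `cod_no = 'S500' OR (cod_no = 'P500' AND prj_end_dt < '2019-01-01')`. A small sketch of the difference, using an in-memory SQLite database and made-up rows (the column names follow the question, the data is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (prj_no INTEGER, prj_end_dt TEXT);
    CREATE TABLE project_labour (prj_no INTEGER, cod_no TEXT, pla_hrs_budg INTEGER);
    INSERT INTO project VALUES (1, '2018-06-30'), (2, '2020-03-31');
    INSERT INTO project_labour VALUES
        (1, 'S500', 10), (1, 'P500', 20),
        (2, 'S500', 30), (2, 'P500', 40);
""")

query = """
    SELECT pl.cod_no, p.prj_no
    FROM project p
    JOIN project_labour pl ON pl.prj_no = p.prj_no
    WHERE {condition}
    ORDER BY p.prj_no, pl.cod_no
"""

# As written in the question: parsed as S500 OR (P500 AND date).
unparenthesised = conn.execute(query.format(
    condition="pl.cod_no = 'S500' OR pl.cod_no = 'P500' "
              "AND p.prj_end_dt < '2019-01-01'"
)).fetchall()

# With IN, the date filter applies to both codes.
with_in = conn.execute(query.format(
    condition="pl.cod_no IN ('S500', 'P500') AND p.prj_end_dt < '2019-01-01'"
)).fetchall()

print(unparenthesised)
print(with_in)
```

In the first result the S500 row of the 2020 project slips through because the date test only guards the P500 branch.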

Converting subquery to join assistance

My subquery is severely slowing my full query down in MySQL. I'm in the process of converting the original query to work on MySQL, as I'm moving away from SQL Server, where it has worked wonderfully. MySQL, on the other hand, isn't too happy. I was wondering if anyone could help me convert it to a join, as I'm not well versed in joins quite yet. Thanks!
select a.crm_ticket_details_detail,
crm_ticket_created_date,
crm_ticket_id,
crm_ticket_customer_id,
c.crm_assigned_user
from php_crm.crm_ticket,
php_crm.crm_ticket_details a,
php_crm.crm_assigned c
where crm_ticket_resolved_date is null
and crm_ticket_id = a.crm_ticket_details_ticket_id
and a.crm_ticket_details_type = 'issue'
and c.crm_assigned_ticket_id = crm_ticket_id
and c.crm_assigned_id = (select max(d.crm_assigned_id)
from php_crm.crm_assigned d
where d.crm_assigned_ticket_id = crm_ticket_id)
SELECT
DETAILS.crm_ticket_details_detail,
CT.crm_ticket_created_date,
CT.crm_ticket_id,
CT.crm_ticket_customer_id,
ASSIGNED.crm_assigned_user
FROM
php_crm.crm_ticket CT -- (no alias in the original)
INNER JOIN php_crm.crm_ticket_details DETAILS -- (A)
ON CT.crm_ticket_id = DETAILS.crm_ticket_details_ticket_id
INNER JOIN php_crm.crm_assigned ASSIGNED -- (C)
ON CT.crm_ticket_id = ASSIGNED.crm_assigned_ticket_id
WHERE
CT.crm_ticket_resolved_date IS NULL
AND DETAILS.crm_ticket_details_type = 'issue'
AND ASSIGNED.crm_assigned_id = (SELECT
max(d.crm_assigned_id)
FROM
php_crm.crm_assigned d
WHERE
d.crm_assigned_ticket_id = CT.crm_ticket_id)
I believe that's what you're looking for. I can't speak to whether it will actually improve performance, although it will certainly make it easier to understand. I'm not sure the old style of joins is actually less efficient; it is just harder to read and makes it easier to create accidental Cartesian products.
That said, if there are other common keys between the three tables that are being indirectly neutralized in other parts of the logic then that could have a performance impact.
(EDIT: Actually not sure if this was what you're looking for, reread your question and you seem focused on the subquery... I don't see any problems jumping out with that, would need more details to address that.)
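For the subquery itself, one common rewrite is to join against a derived table that holds the highest crm_assigned_id per ticket, instead of evaluating a correlated subquery for every row. The sketch below is an assumption about what would help, not a measured fix: it runs on an in-memory SQLite database with simplified, hypothetical versions of the tables (the details table is left out), and whether it is faster on MySQL depends on the version and the indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_ticket (crm_ticket_id INTEGER, crm_ticket_resolved_date TEXT);
    CREATE TABLE crm_assigned (
        crm_assigned_id INTEGER,
        crm_assigned_ticket_id INTEGER,
        crm_assigned_user TEXT
    );
    -- Hypothetical data: ticket 1 was reassigned, ticket 3 is resolved.
    INSERT INTO crm_ticket VALUES (1, NULL), (2, NULL), (3, '2020-01-01');
    INSERT INTO crm_assigned VALUES
        (10, 1, 'alice'), (11, 1, 'bob'), (12, 2, 'carol'), (13, 3, 'dave');
""")

rows = conn.execute("""
    SELECT ct.crm_ticket_id, a.crm_assigned_user
    FROM crm_ticket ct
    INNER JOIN (
        -- One row per ticket: the highest (latest) assignment id.
        SELECT crm_assigned_ticket_id, MAX(crm_assigned_id) AS max_id
        FROM crm_assigned
        GROUP BY crm_assigned_ticket_id
    ) latest ON latest.crm_assigned_ticket_id = ct.crm_ticket_id
    INNER JOIN crm_assigned a
        ON a.crm_assigned_ticket_id = latest.crm_assigned_ticket_id
       AND a.crm_assigned_id = latest.max_id
    WHERE ct.crm_ticket_resolved_date IS NULL
    ORDER BY ct.crm_ticket_id
""").fetchall()

print(rows)
```

The derived table is computed once and then joined, which is the part that replaces the per-row subquery.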

SQL Query not searching data if record contains zero

I have two tables, each with a paperNo column and some data about that paper. I am trying to fetch all the data for a given paper number from both tables. I have written a query and it retrieves data, but I have noticed that if my paperNo contains a zero (0), the query does not find that record. And for a paperNo that does not contain a zero, it retrieves the same record twice.
I don't understand what is going wrong. I have tried everything.
Here is my query:
SELECT PaperDate.paperNo,
PaperDate.RAW_PAPER,
PaperDate.EDGE_SEALED,
PaperDate.HYDRO_120,
PaperDate.HYDRO_350,
PaperDate.CATALYST_1ST,
PaperDate.CATALYST_2ND,
PaperDate.SIC_350,
tblThicknessPaperDate.rawThickness,
tblThicknessPaperDate.catThickness,
tblThicknessPaperDate.sicThickness,
tblThicknessPaperDate.rejectedThickness
FROM tblThicknessPaperDate
FULL OUTER JOIN PaperDate ON PaperDate.paperNo =tblThicknessPaperDate.paperNo
WHERE (tblThicknessPaperDate.paperNo = #paperNo)
I would try:
FROM tblThicknessPaperDate
RIGHT JOIN PaperDate ON PaperDate.paperNo =tblThicknessPaperDate.paperNo
WHERE (PaperDate.paperNo = #paperNo)
There are two changes. The first is swapping to a right join, so that even if a record isn't in tblThicknessPaperDate we will still see the record in PaperDate. The second is to use PaperDate.paperNo in the WHERE clause. Since tblThicknessPaperDate.paperNo could be NULL, we don't want to use that in the WHERE if we can avoid it.
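Why the choice of column in the WHERE clause matters can be shown with a small sketch. It assumes an in-memory SQLite database with cut-down, hypothetical versions of the two tables, and it writes the join as a LEFT JOIN from PaperDate, which is the same join as the RIGHT JOIN above with the tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PaperDate (paperNo TEXT, RAW_PAPER TEXT);
    CREATE TABLE tblThicknessPaperDate (paperNo TEXT, rawThickness REAL);
    -- Hypothetical data: paper 'A0' has no thickness row.
    INSERT INTO PaperDate VALUES ('A0', '2015-01-01'), ('A1', '2015-01-02');
    INSERT INTO tblThicknessPaperDate VALUES ('A1', 0.5);
""")

sql = """
    SELECT p.paperNo, t.rawThickness
    FROM PaperDate p
    LEFT JOIN tblThicknessPaperDate t ON p.paperNo = t.paperNo
    WHERE {column} = ?
"""

# Filtering on the optional table: its paperNo is NULL, so the row is lost.
lost = conn.execute(sql.format(column="t.paperNo"), ("A0",)).fetchall()

# Filtering on the table that always has the row keeps it.
kept = conn.execute(sql.format(column="p.paperNo"), ("A0",)).fetchall()

print(lost, kept)
```

A comparison against NULL is never true, so a filter on the outer-joined column silently turns the outer join back into an inner join.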
SELECT PaperDate.paperNo,
PaperDate.RAW_PAPER,
PaperDate.EDGE_SEALED,
PaperDate.HYDRO_120,
PaperDate.HYDRO_350,
PaperDate.CATALYST_1ST,
PaperDate.CATALYST_2ND,
PaperDate.SIC_350,
tblThicknessPaperDate.rawThickness,
tblThicknessPaperDate.catThickness,
tblThicknessPaperDate.sicThickness,
tblThicknessPaperDate.rejectedThickness
FROM tblThicknessPaperDate
FULL OUTER JOIN PaperDate ON PaperDate.paperNo =tblThicknessPaperDate.paperNo
WHERE (tblThicknessPaperDate.paperNo = #paperNo OR PaperDate.paperNo = #paperNo)

Produce a Query viewing multiple tables

I have been given a database whose structure and data values are all unchangeable, and have been provided with a question.
Produce a query to list the holiday code, holiday description, holiday duration and site description for all holidays which visit site code 101. Your answer must not assume that site code 101 will always have the same site description.
I am confused about how to tackle this question. I have tried multiple joins, different dot notation, and googled the question to hell and back. Any help?
Table 1 - Holiday_Details
Holiday_Code - Country_Visited - Holiday_Duration - Holiday_Desc - Rating_Code - Cost
Table 2 - Site_Of_Holiday
Site_Description - Site_Code
Table 3 - Site_Visited
Holiday_Code - Site_Code
Comments have asked for previous attempts. This was my first.
SELECT holiday_code,
holiday_desc,
holiday_duration site_of_holiday.Site_Name
FROM holiday_details
JOIN site_visited
ON holiday_code = site_visited.holiday_code
JOIN site_of_holiday
ON site_visited.site_code = site_of_holiday.site_code
WHERE site_of_holiday.site_code = 101;
For future reference, you'll get a better response if you post a lot more detail about your failed attempts. By that, I mean code. Using SO to solve your homework assignments is frowned upon but, like a commenter said, once you've racked your brain we're willing to help.
You seem like you may have actually tried real hard, so I'll throw you a bone...
The trick to navigating multiple tables is to find the "pairs" of matching columns. In this case you want to find a path between the tables Site_Of_Holiday (which has Site_Description) and Holiday_Details (which has everything else).
The columns that match between each pair of tables are:
Holiday_Code is found in both Site_Visited and Holiday_Details
Site_Code is found in both Site_Of_Holiday and Site_Visited
This allows you to build a path between the tables that contain all of the columns we want in the output. You would do this, in this case, using INNER JOINs across those matching column pairs.
Once you've joined the tables, think of the result like a giant table whose columns include all columns from all three tables (prefixed with whatever you 'name' the table during the joins). Now you just filter on the Site_Code with the usual WHERE clause.
Here's the full example - let me know if it works for you:
SELECT hd.Holiday_Code, hd.Holiday_Desc, hd.Holiday_Duration, soh.Site_Description
FROM Holiday_Details hd
INNER JOIN Site_Visited sv ON hd.Holiday_Code = sv.Holiday_Code
INNER JOIN Site_Of_Holiday soh ON sv.Site_Code = soh.Site_Code
WHERE sv.Site_Code = 101
Good luck!
P.S. In case any Americans get a similar assignment, here's the translation ;-)
SELECT vd.Vacation_Code, vd.Vacation_Desc, vd.Vacation_Duration, sov.Site_Description
FROM Vacation_Details vd
INNER JOIN Site_Visited sv ON vd.Vacation_Code = sv.Vacation_Code
INNER JOIN Site_Of_Vacation sov ON sv.Site_Code = sov.Site_Code
WHERE sv.Site_Code = 101

Access query returns empty fields depending on how table is linked

I've got an Access MDB I use for reporting that has linked table views from SQL Server 2005. I built a query that retrieves information off of a PO table and categorizes the line item depending on information from another table. I'm relatively certain the query was fine until approximately a month ago when we shifted from compatibility mode 80 to 90 on the Server as required by our primary application (which creates the data). I can't say this with 100% certainty, but that is the only major change made in the past 90 days. We noticed that suddenly data was not showing up in the query making the reports look odd.
This is a copy of the failing query:
SELECT dbo_porel.jobnum, dbo_joboper.opcode, dbo_porel.jobseqtype,
dbo_opmaster.shortchar01,
dbo_porel.ponum, dbo_porel.poline, dbo_podetail.unitcost
FROM ((dbo_porel
LEFT JOIN dbo_joboper ON (dbo_porel.assemblyseq = dbo_joboper.assemblyseq)
AND (dbo_porel.jobseq = dbo_joboper.oprseq)
AND (dbo_porel.jobnum = dbo_joboper.jobnum))
LEFT JOIN dbo_opmaster ON dbo_joboper.opcode = dbo_opmaster.opcode)
LEFT JOIN dbo_podetail ON (dbo_porel.poline = dbo_podetail.poline)
AND (dbo_porel.ponum = dbo_podetail.ponum)
WHERE (dbo_porel.jobnum="367000003")
It returns the following:
jobnum     opcode  jobseqtype  shortchar01  ponum  poline  unitcost
367000003          S                        6624   2       15
The query normally should have displayed a value for opcode and shortchar01. If I remove the linked table dbo_podetail, it properly displays data for these fields (although I obviously don't have unitcost anymore). At first I thought it might be a data issue, but I found if I nested the query and then linked the table, it worked fine.
For example the following code works perfectly:
SELECT qryTest.*, dbo_podetail.unitcost
FROM (
SELECT dbo_porel.jobnum, dbo_joboper.opcode, dbo_porel.jobseqtype,
dbo_opmaster.shortchar01, dbo_porel.ponum, dbo_porel.poline
FROM (dbo_porel
LEFT JOIN dbo_joboper ON (dbo_porel.jobnum=dbo_joboper.jobnum)
AND (dbo_porel.jobseq=dbo_joboper.oprseq)
AND (dbo_porel.assemblyseq=dbo_joboper.assemblyseq))
LEFT JOIN dbo_opmaster ON dbo_joboper.opcode=dbo_opmaster.opcode
WHERE (dbo_porel.jobnum="367000003")
) As qryTest
LEFT JOIN dbo_podetail ON (qryTest.poline = dbo_podetail.poline)
AND (qryTest.ponum = dbo_podetail.ponum)
I'm at a loss for why it works in the latter case and not in the first case. Worse yet, it seems to work intermittently for some records and not for others (it's consistent about the ones it does and does not work for).
Do any of you experts have any ideas?
You definitely need to use subqueries for multiple left/right joins in Access.
I think it's a limitation of the Jet optimizer that gets confused if you're just chaining left/right joins.
You can see that this is a recurrent problem that surfaces often.
I'm always confused by Access' use of brackets in joins. Try stripping out the extra brackets.
FROM
dbo_porel
LEFT JOIN
dbo_joboper ON (dbo_porel.assemblyseq = dbo_joboper.assemblyseq)
AND (dbo_porel.jobseq = dbo_joboper.oprseq)
AND (dbo_porel.jobnum = dbo_joboper.jobnum)
LEFT JOIN
dbo_opmaster ON (dbo_joboper.opcode = dbo_opmaster.opcode)
LEFT JOIN
dbo_podetail ON (dbo_porel.poline = dbo_podetail.poline)
AND (dbo_porel.ponum = dbo_podetail.ponum)
OK the above doesn't work - Sorry I give up