Having SQL Server choose and show one record over other - sql

Ok, hopefully I can explain this accurately. I work in SQL Server, and I am trying to get one row from a table that will show multiple rows for the same person for various reasons.
There is a column called college_attend which will show either New or Cont for each student.
My issue: my initial query narrows down the rows I'm pulling by Academic Year, which consists of two semesters: Fall of one year, and Spring of the following to create an academic year. This is why there are two rows returned for some students.
Basically, I need to generate an accurate count of those that are "New" and those that are "Cont", but I don't want both records for the same student counted. They will have two records because they will have one for spring and one for fall (usually). So if a student is "New" in fall, they will have a "Cont" record for spring. I want the query to show ONLY the "New" record if they have both a "New' and "Cont" record, and count it (which I will do in Report Builder). The other students will basically have two records that are "Cont": one for fall, and one "Cont" for spring, and so those would be considered the continuing ones or "Cont".
Here is the basic query I have so far:
SELECT DISTINCT
people.people_id,
people.last_name,
people.first_name,
academic.college_attend AS NewORCont,
academic.academic_year,
academic.academic_term,
FROM
academic
INNER JOIN
people ON people.people_id = academic.people_id
INNER JOIN
academiccalendar acc ON acc.academic_year = academic.academic_year
AND acc.academic_term = academic.academic_term
AND acc.true_academic_year = #Academic_year
I'm not sure if this can be done with a CASE statement? I thought of a GROUP BY, but then SQL Server will want me to add all of my columns to the GROUP BY clause, and that ends up negating the purpose of the grouping in the first place.
Just a sample of what I work with for each student:
People ID
Last
First
NeworCont
12345
Soanso
Guy
New
12345
Soanso
Guy
Cont
32345
Person
Nancy
Cont
32345
Person
Nancy
Cont
55555
Smith
John
New
55555
Smith
John
Cont
---------
------
-------
----------
Hopefully this sheds some light on the duplicate record issue I mentioned.

Without sample data its awkward to visualize the problem, and without the expected results specified it's also unclear what you want as the outcome. Perhaps this will assist, it will limit the results to only those who have both 'New' and 'Cont' in a single "true academic year" but the count seems redundant as this (I imagine) will always be 2 (being 1 New term and 1 Cont term)
SELECT
people.people_id
, people.last_name
, people.first_name
, acc.true_academic_year
, count(*) AS count_of
FROM academic
INNER JOIN people ON people.people_id = academic.people_id
INNER JOIN academiccalendar acc ON acc.academic_year = academic.academic_year
AND acc.academic_term = academic.academic_term
AND acc.true_academic_year = #Academic_year
GROUP BY
people.people_id
, people.last_name
, people.first_name
, acc.true_academic_year
HAVING MAX(academic.college_attend) = 'New'
AND MIN(academic.college_attend) = 'Cont'

Related

Sql Left or Right Join One To Many Pagination

I have one main table and join other tables via left outer or right outer outer join.One row of main table have over 30 row in join query as result. And I try pagination. But the problem is I can not know how many rows will it return for one main table row result.
Example :
Main table first row result is in my query 40 rows.
Main table second row result is 120 row.
Problem(Question) UPDATE:
For pagination I need give the pagesize the count of select result. But I can not know the right count for my select result. Example I give page no 1 and pagesize 50, because of this I cant get the right result.I need give the right pagesize for my main table top 10 result. Maybe for top 10 row will the result row count 200 but my page size is 50 this is the problem.
I am using Sql 2014. I need it for my ASP.NET project but is not important.
Sample UPDATE :
it is like searching an hotel for booking. Your main table is hotel table. And the another things are (mediatable)images, (mediatable)videos, (placetable)location and maybe (commenttable)comments they are more than one rows and have one to many relationship for the hotel. For one hotel the result will like 100, 50 or 10 rows for this all info. And I am trying to paginate this hotels result. I need get always 20 or 30 or 50 hotels for performance in my project.
Sample Query UPDATE :
SELECT
*
FROM
KisiselCoach KC
JOIN WorkPlace WP
ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
JOIN Album A
ON KC.KisiselCoachId = A.AlbumId
JOIN Media M
ON A.AlbumId = M.AlbumId
LEFT JOIN Rating R
ON KC.KisiselCoachId = R.OylananId
JOIN FrUser Fr
ON KC.CoachId = Fr.UserId
JOIN UserJob UJ
ON KC.KisiselCoachId = UJ.UserJobOwnerId
JOIN Job J
ON UJ.JobId = J.JobId
JOIN UserExpertise UserEx
ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
JOIN Expertise Ex
ON UserEx.ExpertiseId = Ex.ExpertiseId
Hotel Table :
HotelId HotelName
1 Barcelona
2 Berlin
Media Table :
MediaID MediaUrl HotelId
1 www.xxx.com 1
2 www.xxx.com 1
3 www.xxx.com 1
4 www.xxx.com 1
Location Table :
LocationId Adress HotelId
1 xyz, Berlin 1
2 xyz, Nice 1
3 xyz, Sevilla 1
4 xyz, Barcelona 1
Comment Table :
CommentId Comment HotelId
1 you are cool 1
2 you are great 1
3 you are bad 1
4 hmm you are okey 1
This is only sample! I have 9999999 hotels in my database. Imagine a hotel maybe it has 100 images maybe zero. I can not know this. And I need get 20 hotels in my result(pagination). But 20 hotels means 1000 rows maybe or 100 rows.
First, your query is poorly written for readability flow / relationship of tables. I have updated and indented to try and show how/where tables related in hierarchical relativity.
You also want to paginate, lets get back to that. Are you intending to show every record as a possible item, or did you intend to show a "parent" level set of data... Ex so you have only one instance per Media, Per User, or whatever, then once that entry is selected you would show details for that one entity? if so, I would do a query of DISTINCT at the top-level, or at least grab the few columns with a count(*) of child records it has to show at the next level.
Also, mixing inner, left and right joins can be confusing. Typically a right-join means you want the records from the right-table of the join. Could this be rewritten to have all required tables to the left, and non-required being left-join TO the secondary table?
Clarification of all these relationships would definitely help along with the context you are trying to get out of the pagination. I'll check for comments, but if lengthy, I would edit your original post question with additional details vs a long comment.
Here is my SOMEWHAT clarified query rewritten to what I THINK the relationships are within your database. Notice my indentations showing where table A -> B -> C -> D for readability. All of these are (INNER) JOINs indicating they all must have a match between all respective tables. If some things are NOT always there, they would be changed to LEFT JOINs
SELECT
*
FROM
KisiselCoach KC
JOIN WorkPlace WP
ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
JOIN Album A
ON KC.KisiselCoachId = A.AlbumId
JOIN Media M
ON A.AlbumId = M.AlbumId
LEFT JOIN Rating R
ON KC.KisiselCoachId = R.OylananId
JOIN FrUser Fr
ON KC.CoachId = Fr.UserId
JOIN UserJob UJ
ON KC.KisiselCoachId = UJ.UserJobOwnerId
JOIN Job J
ON UJ.JobId = J.JobId
JOIN UserExpertise UserEx
ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
JOIN Expertise Ex
ON UserEx.ExpertiseId = Ex.ExpertiseId
Readability of a query is a BIG help for yourself, and/or anyone assisting or following you. By not having the "on" clauses near the corresponding joins can be very confusing to follow.
Also, which is your PRIMARY table where the rest are lookup reference tables.
ADDITION PER COMMENT
Ok, so I updated a query which appears to have no context to the sample data and what you want in your post. That said, I would start with a list of hotels only and a count(*) of things per hotel so you can give SOME indication of how much stuff you have in detail. Something like
select
H.HotelID,
H.HotelName,
coalesce( MedSum.recs, 0 ) as MediaItems,
coalesce( LocSum.recs, 0 ) as NumberOfLocations,
coalesce( ComSum.recs, 0 ) as NumberOfLocations
from
Hotel H
LEFT JOIN
( select M.HotelID,
count(*) recs
from Media M
group by M.HotelID ) MedSum
on H.HotelID = MedSum.HotelID
LEFT JOIN
( select L.HotelID,
count(*) recs
from Location L
group by L.HotelID ) LocSum
on H.HotelID = LocSum.HotelID
LEFT JOIN
( select C.HotelID,
count(*) recs
from Comment C
group by C.HotelID ) ComSum
on H.HotelID = ComSum.HotelID
order by
H.HotelName
--- apply any limit per pagination
Now this will return every hotel at a top-level and the total count of things per the hotel per the individual counts which may or not exist hence each sub-check is a LEFT-JOIN. Expose a page of 20 different hotels. Now, as soon as one person picks a single hotel, you can then drill-into the locations, media and comments per that one hotel.
Now, although this COULD work, having to do these counts on an every-time query might get very time consuming. You might want to add counter columns to your main hotel table representing such counts as being performed here. Then, via some nightly process, you could re-update the counts ONCE to get them primed across all history, then update counts only for those hotels that have new activity since entered the date prior. Not like you are going to have 1,000,000 posts of new images, new locations, new comments in a day, but of 22,000, then those are the only hotel records you would re-update counts for. Each incremental cycle would be short based on only the newest entries added. For the web, having some pre-aggregate counts, sums, etc is a big time saver where practical.

SQL query for crystal reports produces duplicate results

I created 3 tables TstInvoice, TstProd, TstPersons and added some data:
INVOICE_NBR CLIENT_NR VK_CONTACT
A10304 003145 AT
A10305 000079 EA
A10306 004458 AT
A10307 003331 JDJ
PROD_NR INVOICE_NBR
P29366 A10304
P29367 A10304
P29368 A10305
P29369 A10306
P29370 A10306
P29371 A10307
PERS_NR INITIALEN STATUS PERSOON
0001 AT 7 Alice Thompson
0002 EA 1 Edgar Allen
0003 JDJ 1 John Doe Joe
0004 AT 1 Arthur Twins
The parameter that is passed to the crystal report is the INVOICE_NBR.
On my crystal report I put some fields from the databases and one sql expression:
(
SELECT "TstPersons"."PERSOON" FROM "TstPersons"
WHERE "TstPersons"."INITIALEN" = "TstInvoice"."VK_CONTACT" AND "TstPersons"."STATUS" = 1
)
The full query that is generated:
SELECT "TstInvoice"."INVOICE_NBR", "TstInvoice"."CLIENT_NR", "TstPersons"."STATUS", "TstPersons"."PERSOON", "TstProd"."PROD_NR", "TstProd"."INVOICE_NBR", (
SELECT "TstPersons"."PERSOON" FROM "TstPersons"
WHERE "TstPersons"."INITIALEN" = "TstInvoice"."VK_CONTACT" AND "TstPersons"."STATUS" = 1
)
FROM ("GCCTEST"."dbo"."TstInvoice" "TstInvoice" INNER JOIN "GCCTEST"."dbo"."TstProd" "TstProd" ON "TstInvoice"."INVOICE_NBR"="TstProd"."INVOICE_NBR") INNER JOIN "GCCTEST"."dbo"."TstPersons" "TstPersons" ON "TstInvoice"."VK_CONTACT"="TstPersons"."INITIALEN"
WHERE "TstInvoice"."INVOICE_NBR"='A10304'
The result is as shown in the screenshot:
As you can see the TstPersons.PERSOON field is populated with Alice Thompson and the sql expression field is correctly populated with Arthur Twins. However, I would like only to see the prod_nr once. With this query it produces the prod numbers twice because of the double entry for "AT" despite the fact that I ask for only status 1. I could just delete the old entry but I want to know if it's possible this way.
* edit * I added the status = 1 to the "record selection formula editor" and that seems to work. Not need the sql expression field at all. Not sure if this is the correct way to go though.
So now it looks like this:
SELECT "TstInvoice"."INVOICE_NBR", "TstInvoice"."CLIENT_NR", "TstPersons"."STATUS", "TstPersons"."PERSOON", "TstProd"."PROD_NR", "TstProd"."INVOICE_NBR"
FROM ("GCCTEST"."dbo"."TstInvoice" "TstInvoice" INNER JOIN "GCCTEST"."dbo"."TstProd" "TstProd" ON "TstInvoice"."INVOICE_NBR"="TstProd"."INVOICE_NBR") INNER JOIN "GCCTEST"."dbo"."TstPersons" "TstPersons" ON "TstInvoice"."VK_CONTACT"="TstPersons"."INITIALEN"
WHERE "TstInvoice"."INVOICE_NBR"='A10304' AND "TstPersons"."STATUS"=1
You have a very weak join in your query due to the duplicate values found in the INITIALEN column. Using the STATUS = 1 criteria is a work-around more than a solution because if you ever need to report on an invoice where the contact has a status other than 1, you will need to modify the report's design to allow your join to work because the STATUS value is not found on the invoice to allow a proper join to occur.
You are also running a risk of this work-around breaking down completely should you have another contact with both the same initials and status values as another.
The correct way to solve this problem would be to join TstInvoice to TstPersons through a field that has unique values. The PERS_NR column appears to be a good choice for this.
This is also going to require a redesign of the TstInvoice table to include the PERS_NR column as a Foreign Key.
A stronger join between invoices and persons would also remove the need for that sub-query in you selection statement. This would simplify your query down to the following:
SELECT "TstInvoice"."INVOICE_NBR", "TstInvoice"."CLIENT_NR", "TstPersons"."STATUS", "TstPersons"."PERSOON", "TstProd"."PROD_NR", "TstProd"."INVOICE_NBR"
FROM "GCCTEST"."dbo"."TstInvoice" "TstInvoice"
INNER JOIN "GCCTEST"."dbo"."TstProd" "TstProd"
ON "TstInvoice"."INVOICE_NBR"="TstProd"."INVOICE_NBR"
INNER JOIN "GCCTEST"."dbo"."TstPersons" "TstPersons"
ON "TstInvoice"."PERS_NR"="TstPersons"."PERS_NR"
WHERE "TstInvoice"."INVOICE_NBR"='A10304'

Defaulting missing data

I have a complex set of schema that I am trying to pull data out of for a report. The query for it joins a bunch of tables together and I am specifically looking to pull a subset of data where everything for it might be null. The original relations for the tables look as such.
Location.DeptFK
Dept.PK
Section.DeptFK
Subsection.SectionFK
Question.SubsectionFK
Answer.QuestionFK, SubmissionFK
Submission.PK, LocationFK
From here my problems begin to compound a little.
SELECT Section.StepNumber + '-' + Question.QuestionNumber AS QuestionNumberVar,
Question.Question,
Subsection.Name AS Subsection,
Section.Name AS Section,
SUM(CASE WHEN (Answer.Answer = 0) THEN 1 ELSE 0 END) AS NA,
SUM(CASE WHEN (Answer.Answer = 1) THEN 1 ELSE 0 END) AS AnsNo,
SUM(CASE WHEN (Answer.Answer = 2) THEN 1 ELSE 0 END) AS AnsYes,
(select count(distinct Location.Abbreviation) from Department inner join Plant on location.DepartmentFK = Department.PK WHERE(Department.Name = 'insertParameter'))
as total
FROM Department inner join
section on Department.PK = section.DepartmentFK inner JOIN
subsection on Subsection.SectionFK = Section.PK INNER JOIN
question on Question.SubsectionFK = Subsection.PK INNER JOIN
Answer on Answer.QuestionFK = question.PK inner JOIN
Submission on Submission.PK = Answer.SubmissionFK inner join
Location on Location.DepartmentFK = Department.PK AND Location.pk = Submission.PlantFK
WHERE (Department.Name = 'InsertParameter') AND (Submission.MonthTested = '1/1/2017')
GROUP BY Question.Question, QuestionNumberVar, Subsection.Name, Section.Name, Section.StepNumber
ORDER BY QuestionNumberVar;
There are 15 total locations, with this query I get 12. If I remove a relation in the join for Location I get 15 total locations but my answer data gets multiplied by 15. My issue is that not all locations are required to test at the same time so their answers should default to NA, They don't get records placed in the DB so the relationship between Location/Submission is absent.
I have a workaround almost in place via the select count distinct but, The second part is a query for finding what each location answered instead of a sum which brings the problem right back around. It also has to be dynamic because the input parameters for a department won't bring a static number of locations back each time.
I am still learning my SQL so any additional material to look at for building this query would also be appreciated. So I guess the big question here is, How would I go about creating default data in this query for anytime the Location/Submission relation has a null value?
Edit: Dummy Data
QuestionNumberVar | Section | Subsection | Question | AnsYes | AnsNo | NA (expected)
1-1.1 Math Algebra Did you do your homework? 10 1 1(4)
1-1.2 Math Algebra Did your dog eat it? 9 3 0(3)
2-1.1 English Greek Did you do your homework? 8 0 4(7)
I have tried making left joins at various applicable portions of the code to no avail. All attempts at left joins have ended with no effect on info output. This query feeds into the Dataset for an SSRS report. There are a couple workarounds for this particular section via an expression to take total Locations and subtract AnsYes and AnsNo to get the true NA value but as explained above doesn't help with my next query.
Edit: SQL Server 2012 for those who asked
Edit: my attempt at an isnull() on the missing data returns nothing I suspect because the query already eliminates the "null/missing" data. Left joining while doing this has also failed. The point of failure is on Submissions. if we bind it to Locations there are locations missing but if we don't bind it there are multiplied duplicates because Department has a One-To-Many with Location and not vice versa. I am unable to make any schema changes to improve this process.
There is a previous report that I am trying to emulate/update. It used C# logic to process data and run multiple queries to attain the same data. I don't have this luxury. (previous report exports to excel directly instead of SSRS). Here is the previous logic used.
select PK from Department where Name = 'InsertParameter';
select PK from Submission where LocationFK = 'Location.PK_var' and MonthTested = '1/1/2017'
Then it runs those into a loop where it processes nulls into NA using C# logic
EDIT (Mediocre Solution): I ended up doing the workaround of making a calculated field that subtracts Yes and No from the total # of Locations that have that Dept. This is a mediocre solution because I didn't solve my original problem and made 3 datasets that should have been displayed as a singular dataset. One for question info, one for each locations answer and one for locations that didnt participate. If a true answer comes up I will check its validity but for now, Problem psuedo solved.

SQL Query Removing Dups

Using GoldMine CRM and wrote this query to try and find out people in our system based on a record Type, Lead Creation Date and Last Contact Date. Now when running the query it brings up multiple results say for John Smith it is showing me all of his information rather than just the last contacted date which I want. I'll paste the query below any help would be much appreciated.
SELECT DISTINCT CONTACT1.CONTACT AS Contact_Name,
Min(CONTACT1.CONTACT),
CONTACT1.KEY1 AS Rec_Type,
CONTACT1.CREATEON AS Lead_Type,
CONTHIST.LASTDATE AS Last_Contact,
CONTHIST.RESULTCODE AS Stamp
FROM CONTACT1
INNER JOIN CONTHIST
ON CONTACT1.ACCOUNTNO = CONTHIST.ACCOUNTNO
WHERE CONTACT1.KEY1 = 'Lead'
GROUP BY CONTACT1.CONTACT,
CONTACT1.KEY1,
CONTACT1.CREATEON,
CONTHIST.LASTDATE,
CONTHIST.RESULTCODE
HAVING Count(CONTACT1.CONTACT) = 1
ORDER BY CONTACT1.CREATEON

SQL: Need to remove duplicate rows in query containing multiple joins

Note that I'm a complete SQL noob and in the process of learning. Based on Google searches (including searching here) I've tried using SELECT DISTINCT and GROUP BY but neither works, likely due to all of my joins (if anyone knows why they won't work exactly, that would be helpful to learn).
I need data from a variety of tables and below is the only way I know to do it (I just know the basics). The query below works fine but shows duplicates. I need to know how to remove those. The only hint I have right now is perhaps a nested SELECT query but based on research I'm not sure how to implement them. Any help at all would be great, thanks!
USE SQL_Contest
go
SELECT
CLT.Description AS ClockType,
CLK.SerialNumber AS JobClockSerial,
SIT.SiteNumber AS JobID,
SIT.[Name] AS JobsiteName,
SIT.Status AS SiteActivityStatus,
DHA.IssuedDate AS DHAIssuedDate, -- Date the clock was assigned to THAT jobsite
CLK.CreatedDate AS CLKCreatedDate, -- Date clock first was assigned to ANY jobsite
SES.ClockVoltage
FROM tb_Clock CLK
INNER JOIN tb_ClockType CLT
ON CLK.TypeID = CLT.ClockTypeID
INNER JOIN tb_DeviceHolderActivity DHA
ON CLK.ClockGUID = DHA.DeviceGUID
INNER JOIN tb_Site SIT
ON SIT.SiteGUID = DHA.HolderGUID
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
WHERE DHA.ReturnedDate IS NULL
ORDER BY SIT.[Name] ASC
EDIT: I will be reviewing these answers shortly, thank you very much. I'm posting the additional duplicate info per Rob's request:
Everything displays fine until I add:
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
Which I need. That's when a duplicate appears:
JobClock 2,500248E4,08-107,Brentwood Job,1,2007-05-04 13:36:54.000,2007-05-04 13:47:55.407,3049
JobClock 2,500248E4,08-107,Brentwood Job,1,2007-05-04 13:36:54.000,2007-05-04 13:47:55.407,3049
I want that info to only display once. Essentially this query is to determine all active jobsites that have a clock assigned to them, and that job only has one clock assigned to it, and it's only one jobsite, but it's appearing twice.
EDIT 2: Based on the help you guys provided I was able to determine they actually are NOT duplicates, and each session is independent, that is the only one that happened to have two sessions. So now I'm going to try to figure out how to only pull in information from the latest session.
If everything "works fine" until you add:
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
Then there must be more than one record in tb_Session for each CLK.SerialNumber.
Run the following query:
SELECT *
FROM tb_Session SES
WHERE ClockSerialNumber = '08-107'
There should be two records returned. You need to decide how to handle this (i.e. Which record do you want to use?), unless both rows from tb_Session contain identical data, in which case, should they?
You could always change your query to:
SELECT
CLT.Description AS ClockType,
CLK.SerialNumber AS JobClockSerial,
SIT.SiteNumber AS JobID,
SIT.[Name] AS JobsiteName,
SIT.Status AS SiteActivityStatus,
DHA.IssuedDate AS DHAIssuedDate, -- Date the clock was assigned to THAT jobsite
CLK.CreatedDate AS CLKCreatedDate, -- Date clock first was assigned to ANY jobsite
SES.ClockVoltage
FROM tb_Clock CLK
INNER JOIN tb_ClockType CLT
ON CLK.TypeID = CLT.ClockTypeID
INNER JOIN tb_DeviceHolderActivity DHA
ON CLK.ClockGUID = DHA.DeviceGUID
INNER JOIN tb_Site SIT
ON SIT.SiteGUID = DHA.HolderGUID
LEFT JOIN
(
SELECT DISTINCT ClockSerialNumber, ClockVoltage
FROM tb_Session
) SES
ON SES.ClockSerialNumber = CLK.SerialNumber
WHERE DHA.ReturnedDate IS NULL
ORDER BY SIT.[Name] ASC
As that should ensure that SES only contains one record for each unique combination of ClockSerialNumber and ClockVoltage
Take this example dataset:
Ingredient
IngredientId IngredientName
============ =========
1 Apple
2 Orange
3 Pear
4 Tomato
Recipe
RecipeId RecipeName
======== ==========
1 Apple Turnover
2 Apple Pie
3 Poached Pears
Recipe_Ingredient
RecipeId IngredientId Quantity
======== ============ ========
1 1 0.25
1 1 1.00
2 1 2.00
3 3 1.00
Note: Why the Apple Turnover has two lots of apple as ingredients, is neither here nor there, it just does.
The following query will return two rows for the "Apple Turnover" recipe, one row for the "Apple Pie" recipe and one row for the "Poached Pears" recipe, because there are two entries in the Recipe_Ingredient table for IngredientId 1. That's just what happens with a join..
SELECT I.IngredientName,
R.RecipeName
FROM Ingredient I
JOIN Recipe_Ingredient RI
ON I.IngredientId = RI.IngredientId
JOIN Recipe R
ON RI.recipeId = R.RecipeId
You could get this to return only one row by changing it to:
SELECT I.IngredientName,
R.RecipeName
FROM Ingredient I
JOIN Recipe_Ingredient RI
ON I.IngredientId = RI.IngredientId
JOIN Recipe R
ON RI.recipeId = R.RecipeId
GROUP BY I.IngredientName, R.RecipeName
Without more specifics regarding your data, it's hard to apply this to your specific scenario, but the walkthrough may help you understand where the "duplicates" are coming from as someone unfamiliar with SQL
The joins are not your problem. From your comments I will infer that what you are calling "duplicates" are not actual duplicates. If all columns values for 2 "duplicates" returned from the query matched, then either SELECT DISTINCT or GROUP BY would definitely eliminate them. So you should be able to find a solution by looking at your column definitions.
My best guess is that you're getting duplicates of for the same date which aren't really duplicates because the time component of the date doesn't match. To eliminate this problem, you can truncate the date fields to the date only using this technique:
DATEADD(DAY, DATEDIFF(DAY, 0, DHA.IssuedDate), 0) AS DHAIssuedDate,
DATEADD(DAY, DATEDIFF(DAY, 0, CLK.CreatedDate), 0) AS CLKCreatedDate,
If that doesn't work you might want to take a look at JobClockSerial: does this column belong in the query results?