SQL - Remove Duplicates in Single Field

SELECT Company.CompanyName
,Student.Status
,Student.Level
,Student.PlacementYear
,Company.CompanyCode
,Company.HREmail
,Company.Telephone
,Company.HRContact
,PlacedStudents.DateAdded
FROM Student
RIGHT JOIN (Company INNER JOIN PlacedStudents
ON Company.CompanyCode = PlacedStudents.CompanyCode)
ON Student.StudentNo = PlacedStudents.StudentNo
WHERE (((Student.PlacementYear)=" & Year & "))
AND((Student.Status)<>'Still Seeking YOPE')
ORDER BY Company.CompanyName
I have this SQL query, which pulls HR contacts from companies where students are currently placed. However, there can be multiple students at one company, so the query returns duplicate rows. I'm fairly new to SQL. I tried DISTINCT, but it didn't seem to do anything; the duplicates remained.
How can I remove duplicates in the CompanyCode field so that each company appears only once when the query is run?
Below is an image of what happens when I run the query. Hopefully this makes sense.
Any help would be appreciated.

This query should give you companies that have placed students:
SELECT Company.CompanyName
,Company.CompanyCode
,Company.HREmail
,Company.Telephone
,Company.HRContact
FROM Company
WHERE EXISTS (SELECT * FROM PlacedStudents INNER JOIN
Student ON Student.StudentNo = PlacedStudents.StudentNo
WHERE Company.CompanyCode = PlacedStudents.CompanyCode
AND Student.PlacementYear =" & Year & "
AND Student.Status <>'Still Seeking YOPE')
ORDER BY Company.CompanyName;
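To see why EXISTS removes the duplicates, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table layouts are cut down and the sample rows are invented for illustration; the original looks like an Access query built by string concatenation, so treat this as a demonstration of the idea rather than a drop-in replacement. It also binds the year as a parameter instead of concatenating it into the SQL text.

```python
import sqlite3

# In-memory database with simplified, hypothetical versions of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (CompanyCode TEXT PRIMARY KEY, CompanyName TEXT, HRContact TEXT);
CREATE TABLE Student (StudentNo INTEGER PRIMARY KEY, PlacementYear INTEGER, Status TEXT);
CREATE TABLE PlacedStudents (StudentNo INTEGER, CompanyCode TEXT);
INSERT INTO Company VALUES ('A1', 'Acme', 'Ann'), ('B2', 'Bolt', 'Bob');
INSERT INTO Student VALUES (1, 2015, 'Placed'), (2, 2015, 'Placed'),
                           (3, 2015, 'Still Seeking YOPE');
INSERT INTO PlacedStudents VALUES (1, 'A1'), (2, 'A1'), (3, 'B2');
""")

year = 2015  # made-up value standing in for the Year variable

# Joining through PlacedStudents yields one row per placed student.
joined = conn.execute("""
SELECT Company.CompanyName
FROM Company
INNER JOIN PlacedStudents ON Company.CompanyCode = PlacedStudents.CompanyCode
INNER JOIN Student ON Student.StudentNo = PlacedStudents.StudentNo
WHERE Student.PlacementYear = ?
  AND Student.Status <> 'Still Seeking YOPE'
ORDER BY Company.CompanyName
""", (year,)).fetchall()

# EXISTS only asks whether at least one matching student exists,
# so each company is returned at most once.
exists_rows = conn.execute("""
SELECT Company.CompanyName, Company.HRContact
FROM Company
WHERE EXISTS (SELECT 1
              FROM PlacedStudents
              INNER JOIN Student ON Student.StudentNo = PlacedStudents.StudentNo
              WHERE Company.CompanyCode = PlacedStudents.CompanyCode
                AND Student.PlacementYear = ?
                AND Student.Status <> 'Still Seeking YOPE')
ORDER BY Company.CompanyName
""", (year,)).fetchall()

print(joined)       # Acme appears once per placed student
print(exists_rows)  # Acme appears once
```

With this sample data the joined query lists Acme twice, while the EXISTS query lists it once. Bolt is left out of both because its only student is still seeking a placement.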

Your question is asking for HR Contacts from Companies where students are placed. I assume this means if you have 1, 2 or 1,000,000 students at a single company, you only want to see the company listed once?
Your current query is returning information from STUDENT and PLACEDSTUDENTS which is going to result in output like
COMPANY_A STUDENT01 .........
COMPANY_A STUDENT02 .........
COMPANY_A STUDENT03 .........
and so on.
If so, and taking a best guess (since I can't know what's in STUDENT or PLACEDSTUDENTS tables), try not including anything related to STUDENT in the SELECT.
SELECT DISTINCT Company.CompanyName, Company.CompanyCode, Company.HREmail,
Company.Telephone, Company.HRContact FROM
I'll be happy to help more if you can provide more information about the structure of the tables and some examples of data, AND what you actually want from the query.

Related

Having SQL Server choose and show one record over other

Ok, hopefully I can explain this accurately. I work in SQL Server, and I am trying to get one row from a table that will show multiple rows for the same person for various reasons.
There is a column called college_attend which will show either New or Cont for each student.
My issue: my initial query narrows down the rows I'm pulling by Academic Year, which consists of two semesters: Fall of one year, and Spring of the following to create an academic year. This is why there are two rows returned for some students.
Basically, I need an accurate count of students who are "New" and students who are "Cont", but I don't want both records for the same student counted. Students usually have two records, one for fall and one for spring. A student who is "New" in fall will have a "Cont" record for spring. If a student has both a "New" and a "Cont" record, I want the query to show ONLY the "New" record, which I will then count in Report Builder. The other students have two "Cont" records, one for fall and one for spring, and should be counted as continuing ("Cont").
Here is the basic query I have so far:
SELECT DISTINCT
people.people_id,
people.last_name,
people.first_name,
academic.college_attend AS NewORCont,
academic.academic_year,
academic.academic_term
FROM
academic
INNER JOIN
people ON people.people_id = academic.people_id
INNER JOIN
academiccalendar acc ON acc.academic_year = academic.academic_year
AND acc.academic_term = academic.academic_term
AND acc.true_academic_year = #Academic_year
I'm not sure if this can be done with a CASE statement? I thought of a GROUP BY, but then SQL Server will want me to add all of my columns to the GROUP BY clause, and that ends up negating the purpose of the grouping in the first place.
Just a sample of what I work with for each student:
People ID | Last   | First | NeworCont
--------- | ------ | ----- | ---------
12345     | Soanso | Guy   | New
12345     | Soanso | Guy   | Cont
32345     | Person | Nancy | Cont
32345     | Person | Nancy | Cont
55555     | Smith  | John  | New
55555     | Smith  | John  | Cont
Hopefully this sheds some light on the duplicate record issue I mentioned.
Without sample data it's awkward to visualize the problem, and without the expected results it's also unclear what outcome you want. Perhaps this will help: it limits the results to only those who have both 'New' and 'Cont' in a single "true academic year". The count seems redundant, though, as (I imagine) it will always be 2 (1 New term and 1 Cont term).
SELECT
people.people_id
, people.last_name
, people.first_name
, acc.true_academic_year
, count(*) AS count_of
FROM academic
INNER JOIN people ON people.people_id = academic.people_id
INNER JOIN academiccalendar acc ON acc.academic_year = academic.academic_year
AND acc.academic_term = academic.academic_term
AND acc.true_academic_year = #Academic_year
GROUP BY
people.people_id
, people.last_name
, people.first_name
, acc.true_academic_year
HAVING MAX(academic.college_attend) = 'New'
AND MIN(academic.college_attend) = 'Cont'
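A different way to get one row per student, which also answers the CASE question, is conditional aggregation: group by the student only and label the student 'New' if any of their rows is 'New'. The sketch below is not the query above; it uses Python's sqlite3 module with a cut-down academic table, the sample rows from the question, and made-up term names, so the column list and the data are assumptions.

```python
import sqlite3

# Simplified, hypothetical version of the academic table (term names are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE academic (people_id INTEGER, academic_term TEXT, college_attend TEXT);
INSERT INTO academic VALUES
  (12345, 'FALL', 'New'),  (12345, 'SPRING', 'Cont'),
  (32345, 'FALL', 'Cont'), (32345, 'SPRING', 'Cont'),
  (55555, 'FALL', 'New'),  (55555, 'SPRING', 'Cont');
""")

# One row per student: 'New' wins whenever the student has at least one 'New' row.
per_student = conn.execute("""
SELECT people_id,
       CASE WHEN SUM(CASE WHEN college_attend = 'New' THEN 1 ELSE 0 END) > 0
            THEN 'New'
            ELSE 'Cont'
       END AS NewORCont
FROM academic
GROUP BY people_id
ORDER BY people_id
""").fetchall()

print(per_student)
```

Because the grouping is on people_id alone, the two term rows collapse into one, and the report can then count 'New' and 'Cont' without double counting. In the real query the name columns would also need to be in the GROUP BY, which is harmless because they do not vary within a student.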

Use group by with sum in query

The 3 tables that you see in the image are related: the course table, the coaching table, and the sales table.
I want to build a report from these tables showing how much each coach has sold for each course period.
The query I created is below, but it has a problem and I cannot tell where it is.
Please help me fix it.
Thank you
SELECT
dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid, dbo.tblPost.postTitle,
dbo.tblArticleAuthor.authorName, SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM
dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblCustomersOrders.pid
SUM() is an aggregate function, so every non-aggregated column in the SELECT list
must also appear in the GROUP BY clause.
Example:
SELECT
dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid, dbo.tblPost.postTitle,
dbo.tblArticleAuthor.authorName, SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblCustomersOrders.id, dbo.tblCustomersOrders.pid,
dbo.tblPost.postTitle, dbo.tblArticleAuthor.authorName
But this query does not solve the need for your report.
If you just need "how much each coach has sold by each course", you can try the query below.
SELECT
dbo.tblArticleAuthor.authorName, dbo.tblPost.postTitle,
SUM(dbo.tblCustomersOrders.prodPrice) AS TotalBuys
FROM dbo.tblPost
INNER JOIN
dbo.tblArticleAuthor ON dbo.tblPost.id = dbo.tblArticleAuthor.articleID
INNER JOIN
dbo.tblCustomersOrders ON dbo.tblPost.id = dbo.tblCustomersOrders.pid
GROUP BY dbo.tblArticleAuthor.authorName, dbo.tblPost.postTitle
If you need, send more details regarding the desired result.
Here you can find more information about SQL SERVER Aggregate Functions:
https://learn.microsoft.com/en-us/sql/t-sql/functions/aggregate-functions-transact-sql?view=sql-server-ver15
And here is a quick example of using SQL aliases to write queries more simply:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_alias_table
Per your description of the task, the problem is that you only GROUP BY dbo.tblCustomersOrders.pid, which I guess is the period's id, but you also need to GROUP BY the coach, which I guess is dbo.tblArticleAuthor.authorName. Also, the SELECT list can only contain columns that are either aggregated or listed in the GROUP BY.
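The grouping rule can be checked on a small data set. The following sketch uses Python's sqlite3 module with stripped-down versions of the three tables; the rows, names and prices are invented, and SQLite is only a stand-in for SQL Server here (SQLite is more lenient about non-grouped columns, so it will not reproduce the original error).

```python
import sqlite3

# Hypothetical miniature versions of the three tables, with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPost (id INTEGER PRIMARY KEY, postTitle TEXT);
CREATE TABLE tblArticleAuthor (articleID INTEGER, authorName TEXT);
CREATE TABLE tblCustomersOrders (id INTEGER PRIMARY KEY, pid INTEGER, prodPrice INTEGER);
INSERT INTO tblPost VALUES (1, 'SQL Basics'), (2, 'Advanced SQL');
INSERT INTO tblArticleAuthor VALUES (1, 'Sara'), (2, 'Omid');
INSERT INTO tblCustomersOrders VALUES (10, 1, 100), (11, 1, 150), (12, 2, 300);
""")

# Every selected column is either grouped (authorName, postTitle) or aggregated (SUM).
totals = conn.execute("""
SELECT a.authorName, p.postTitle, SUM(o.prodPrice) AS TotalBuys
FROM tblPost AS p
INNER JOIN tblArticleAuthor AS a ON p.id = a.articleID
INNER JOIN tblCustomersOrders AS o ON p.id = o.pid
GROUP BY a.authorName, p.postTitle
ORDER BY a.authorName
""").fetchall()

print(totals)
```

The two orders for 'SQL Basics' collapse into a single row with their prices summed, which is the "one row per coach and course" shape the report needs. Leaving the order id out of the SELECT list is what allows the rows to collapse.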

Need help finding only Employees and Sales Persons in SQL

I am trying to run an SQL query which would fetch me all people who are
1. Only employees,
2. Employees and a sales person and
3. Only sales persons.
I am working on the Oracle E-Business Suite. So far, my query returns only those people who are employees only and those people who are employees and also a sales person. Here is what I've managed so far:
select distinct PAF.LAST_NAME,
PAF.START_DATE "HIRE_DATE",
PAF.EMPLOYEE_NUMBER,
PPT.SYSTEM_PERSON_TYPE "PERSON_TYPE",
JRS.SALES_CREDIT_TYPE_ID,
JRS.SALESREP_NUMBER
from PER_ALL_PEOPLE_F PAF,
PER_PERSON_TYPES PPT,
PER_PERSON_TYPE_USAGES_F PPTU,
JTF_RS_DEFRESOURCES_VL JRDV,
JTF_RS_SALESREPS JRS
where PAF.PERSON_ID = PPTU.PERSON_ID
and PPTU.PERSON_TYPE_ID = PPT.PERSON_TYPE_ID
and PPT.SYSTEM_PERSON_TYPE in ('EMP','OTHER')
and JRDV.category in ('EMPLOYEE','OTHER')
and (JRS.SALESREP_NUMBER(+) = PAF.EMPLOYEE_NUMBER)
and sysdate between PAF.EFFECTIVE_START_DATE and PAF.EFFECTIVE_END_DATE;
This is what I want to achieve
I have to include those people who are ONLY salespersons. Basically, there should be some rows which have no Employee_Number but only SALESREP_NUMBER. What am I doing wrong?
Pretty sure your issue lies here:
AND PAF.EMPLOYEE_NUMBER = JRS.SALESREP_NUMBER
Like the comment above says, you don't give much info. But, an educated guess would be that the equivalence indicated above would make that person an employee AND a salesperson. Maybe something more like:
AND (
PAF.EMPLOYEE_NUMBER = JRS.SALESREP_NUMBER
OR
( PAF.EMPLOYEE_NUMBER IS NOT NULL AND JRS.SALESREP_NUMBER IS NULL)
OR
( PAF.EMPLOYEE_NUMBER IS NULL AND JRS.SALESREP_NUMBER IS NOT NULL)
)
Or maybe just delete that clause?
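One more thing worth checking: with the (+) on the JRS side, the outer join keeps every PAF row and fills in NULLs for missing sales reps, but it cannot produce a row for a sales rep who has no matching employee. Getting all three groups needs a full outer join, which Oracle supports in ANSI syntax as FULL OUTER JOIN. The sketch below shows the shape of the result using Python's sqlite3 module and two hypothetical, heavily simplified tables (employees and salesreps are invented names, not the E-Business Suite tables). It emulates the full outer join with LEFT JOIN plus UNION ALL so that it also runs on older SQLite versions.

```python
import sqlite3

# Hypothetical stand-ins for the people and sales rep tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_number TEXT, last_name TEXT);
CREATE TABLE salesreps (salesrep_number TEXT, rep_name TEXT);
INSERT INTO employees VALUES ('100', 'Adams'), ('200', 'Baker');
INSERT INTO salesreps VALUES ('200', 'Baker'), ('900', 'Clark');
""")

# First arm: every employee, with the sales rep number when one matches.
# Second arm: sales reps that match no employee at all.
people = conn.execute("""
SELECT e.employee_number, s.salesrep_number
FROM employees AS e
LEFT JOIN salesreps AS s ON s.salesrep_number = e.employee_number
UNION ALL
SELECT NULL, s.salesrep_number
FROM salesreps AS s
WHERE NOT EXISTS (SELECT 1
                  FROM employees AS e
                  WHERE e.employee_number = s.salesrep_number)
""").fetchall()

for row in people:
    print(row)
```

Adams comes back as employee only, Baker as both, and Clark as sales rep only, with NULL (None in Python) in the employee number column.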

How to make a query to obtain only results that have N number within a range of values?

I'm trying to extract nutrient data in MS Access 2007 from the USDA food database, freely available at http://www.ars.usda.gov/Services/docs.htm?docid=24912
I need records that have ALL nutrients from NUT_DATA.Nutr_No . Those records have values between '501' and '511' . But I wish to exclude incomplete records that have missing values.
Currently, Baby food banana has all from nutrient 501 to 511, but Baby food Beverage has only 9 of the nutrients listed, and many others are like that.
As a last resort, I guess it would be acceptable to have all records, showing null for missing values, as long as each FOOD_DES.Long_Desc has exactly 11 records, one for each NUT_DATA.Nutr_No OR NUTR_DEF.NutrDesc (which correspond to each other).
SELECT
FOOD_DES.NDB_No, FOOD_DES.FdGrp_Cd, FOOD_DES.Long_Desc, NUT_DATA.Nutr_No, NUTR_DEF.NutrDesc, NUT_DATA.Nutr_Val, WEIGHT.Amount, WEIGHT.Msre_Desc, WEIGHT.Gm_Wgt, [WEIGHT]![Amount] & " " & [WEIGHT]![Msre_Desc] AS msre
FROM
NUTR_DEF inner JOIN ((FOOD_DES INNER JOIN NUT_DATA ON FOOD_DES.NDB_No=NUT_DATA.NDB_No) INNER JOIN WEIGHT ON FOOD_DES.NDB_No=WEIGHT.NDB_No) ON NUTR_DEF.Nutr_No=NUT_DATA.Nutr_No
WHERE
(NUT_DATA.Nutr_No between '501' and '511' ) and ((WEIGHT.Seq)="1") and NUT_DATA.Nutr_Val > '0' and
// this part is me out of ideas trying stuff, but didn't help
EXISTS (SELECT 1
FROM
NUTR_DEF inner JOIN ((FOOD_DES INNER JOIN NUT_DATA ON FOOD_DES.NDB_No=NUT_DATA.NDB_No) INNER JOIN WEIGHT ON FOOD_DES.NDB_No=WEIGHT.NDB_No) ON NUTR_DEF.Nutr_No=NUT_DATA.Nutr_No
WHERE count FOOD_DES.Long_Desc = "11" )
// end of wild experimentation
ORDER BY FOOD_DES.Long_Desc, NUTR_DEF.SR_Order;
This is a sample of the data. I just copied the most important columns. The red is not what I'm looking for because it doesn't have all 11 nutrients. I can paste on the google doc the whole table if someone thinks that would help.
https://docs.google.com/spreadsheets/d/1FghDD59wy2PYlpsqUlYVc3Ulwvy4MMLagpBUYtvLBfI/edit?usp=sharing
As your starting point, identify which food items have values > 0 for all 11 of those nutrients. Check whether this simpler GROUP BY query shows you the correct items:
SELECT ndat.NDB_No
FROM
NUT_DATA AS ndat
INNER JOIN WEIGHT AS wt
ON ndat.NDB_No = wt.NDB_No
WHERE
ndat.Nutr_Val>0
AND ndat.Nutr_No IN('501','502','503','504','505','506','507','508','509','510','511')
AND wt.Seq='1'
GROUP BY ndat.NDB_No
HAVING Count(ndat.Nutr_No)=11;
Note you could use Val(ndat.Nutr_No) Between 501 And 511 as the Nutr_No restriction, which would give you a more concise statement. However, evaluating Val() for every row of the table means that approach would forego the performance benefit of indexed retrieval ... so that version of the query should be noticeably slower.
Save that query and create a new query which joins it to the base tables for the additional data you need from other columns. Or use it as a subquery instead of a named query if you prefer.
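As a rough illustration of the second step, the sketch below joins the GROUP BY result back to the base tables as a subquery. It uses Python's sqlite3 module instead of Access, scales the requirement down from 11 nutrients to 3 to keep the sample data short, leaves out the WEIGHT table, and uses invented rows, so every value in it is an assumption.

```python
import sqlite3

# Cut-down, hypothetical versions of FOOD_DES and NUT_DATA with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FOOD_DES (NDB_No TEXT PRIMARY KEY, Long_Desc TEXT);
CREATE TABLE NUT_DATA (NDB_No TEXT, Nutr_No TEXT, Nutr_Val REAL);
INSERT INTO FOOD_DES VALUES ('01', 'Babyfood, banana'), ('02', 'Babyfood, beverage');
INSERT INTO NUT_DATA VALUES
  ('01', '501', 0.1), ('01', '502', 0.2), ('01', '503', 0.3),
  ('02', '501', 0.1), ('02', '503', 0.3);
""")

# The inner query keeps only foods that have all 3 required nutrients;
# the outer query then fetches the detail rows for those foods.
complete = conn.execute("""
SELECT f.Long_Desc, n.Nutr_No, n.Nutr_Val
FROM FOOD_DES AS f
INNER JOIN NUT_DATA AS n ON n.NDB_No = f.NDB_No
INNER JOIN (
    SELECT NDB_No
    FROM NUT_DATA
    WHERE Nutr_Val > 0
      AND Nutr_No IN ('501', '502', '503')
    GROUP BY NDB_No
    HAVING COUNT(Nutr_No) = 3
) AS ok ON ok.NDB_No = f.NDB_No
WHERE n.Nutr_No IN ('501', '502', '503')
ORDER BY f.Long_Desc, n.Nutr_No
""").fetchall()

for row in complete:
    print(row)
```

The beverage row set is dropped entirely because it has only two of the three required nutrients, while the banana item keeps all of its rows. For the real data the IN list would hold the 11 nutrient numbers and the HAVING count would be 11.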

SUM(a*b) not working

I have a PHP page running against Postgres. I have 3 tables: workorders, wo_parts and part2vendor. I am trying to multiply two columns from different tables together: wo_parts has a field called qty and part2vendor has a field called cost. The two tables are joined by wo_parts.pn and part2vendor.pn. I have created a query like this:
$scoreCostQuery = "SELECT SUM(part2vendor.cost*wo_parts.qty) as total_score
FROM part2vendor
INNER JOIN wo_parts
ON (wo_parts.pn=part2vendor.pn)
WHERE workorder=$workorder";
But if I add up the part costs multiplied by the quantities supplied by hand, I get a different number than the script does. Help... I am new to this, but if someone can show me in SQL I can modify it for Postgres. Thanks
Without seeing example data, there's no way for us to know why your query totals come out differently than when you do the math by hand. It could be a bad join, so you are getting more or fewer records than you expected. It's also possible that your calculations are off. Pick an example with the smallest number of associated records and compare.
My suggestion is to add a GROUP BY to the query:
SELECT SUM(p.cost * wp.qty) as total_score
FROM part2vendor p
JOIN wo_parts wp ON wp.pn = p.pn
WHERE workorder = $workorder
GROUP BY workorder
FYI: MySQL was designed to allow flexibility in the GROUP BY, while no other db I've used does - it's a source of numerous questions on SO "why does this work in MySQL when it doesn't work on db x...".
To check that your quantities are correct:
SELECT wp.qty,
p.cost
FROM WO_PARTS wp
JOIN PART2VENDOR p ON p.pn = wp.pn
WHERE p.workorder = $workorder
Check that the numbers are correct for a given order.
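One way a "bad join" can inflate the total is if part2vendor holds more than one row per part number, for example one row per vendor. That is only a guess about the data, but it is easy to test for. The sketch below uses Python's sqlite3 module with invented rows, a hypothetical vendor column, and the assumption that workorder lives in wo_parts.

```python
import sqlite3

# Hypothetical data: part P1 is listed under two vendors in part2vendor.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wo_parts (workorder INTEGER, pn TEXT, qty INTEGER);
CREATE TABLE part2vendor (pn TEXT, vendor TEXT, cost INTEGER);
INSERT INTO wo_parts VALUES (1, 'P1', 2), (1, 'P2', 1);
INSERT INTO part2vendor VALUES ('P1', 'V1', 10), ('P1', 'V2', 10), ('P2', 'V1', 5);
""")

workorder = 1

# Listing the joined rows shows P1 twice, once per vendor row.
detail = conn.execute("""
SELECT wp.pn, wp.qty, p.cost
FROM wo_parts AS wp
JOIN part2vendor AS p ON p.pn = wp.pn
WHERE wp.workorder = ?
ORDER BY wp.pn, p.vendor
""", (workorder,)).fetchall()

# The same join summed: the duplicated P1 row is counted twice.
total = conn.execute("""
SELECT SUM(p.cost * wp.qty)
FROM part2vendor AS p
JOIN wo_parts AS wp ON wp.pn = p.pn
WHERE wp.workorder = ?
""", (workorder,)).fetchone()[0]

print(detail)
print(total)
```

Done by hand, this work order costs 2 * 10 + 1 * 5 = 25, but the query returns 45 because P1 matches two part2vendor rows. If the detail listing for a real work order shows repeated parts like this, the join needs an extra condition that picks a single vendor row per part.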
You could try a sub-query instead.
(Note, I don't have a Postgres installation to test this on so consider this more like pseudo code than a working example... It does work in MySQL tho)
SELECT
SUM(p.`score`) AS 'total_score'
FROM part2vendor AS p2v
INNER JOIN (
SELECT pn, cost * qty AS `score`
FROM wo_parts
) AS p
ON p.pn = p2v.pn
WHERE p2v.workorder=$workorder
In the question, you say the cost column is in part2vendor, but in the query you reference wo_parts.cost. If the wo_parts table has its own cost column, that's the source of the problem.