SQL remove and filter similar results

I have the query below, which lists the orders whose address appears exactly twice, so that we can pack both orders going to the same address together.
select * from Orders
where
address in (select address from orders group by address having count(*) = 2)
CustID   StockID   Address         Company
-----------------------------------------------------------------
1217     23185     1 Some Road     Stockton
58458    23185     1 Some Road
58459    23185     4 John St
58457    23185     4 John St
299576   23185     9 Roadway PDE   Graceland
59470    23185     9 Roadway PDE   Cahill Tow
97656    23185     24 Kent St
97677    23185     24 Kent St
212732   23185     23 Best Rd
226583   23185     23 Best Rd      c/o John
191718   23185     98 King St
156363   23185     98 King St
121106   23185     19 Broadway
156362   23185     19 Broadway
I want the result to look like the table below: it should exclude every address where either of its two rows has a company name. Some rows have nothing in Company, but I want to exclude them as well if the other row for the same address contains a company name.
CustID   StockID   Address         Company
-----------------------------------------------------------------
58459    23185     4 John St
58457    23185     4 John St
97656    23185     24 Kent St
97677    23185     24 Kent St
191718   23185     98 King St
156363   23185     98 King St
121106   23185     19 Broadway
156362   23185     19 Broadway
Hope this all makes sense and appreciate any help
Thank you!

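select * from Orders o1
where
0 = (select count(*) from Orders o2
where o1.address = o2.address
and o2.company is not null)
This is a correlated subquery: for each row of the outer query, the subquery counts the orders at the same address that have a company, and the row is kept only when that count is 0. I have assumed an "empty" company is NULL; if you store empty strings instead, replace the test with o2.company <> ''. Note that this query on its own does not restrict the result to addresses that occur exactly twice, so keep your original address in (...) condition as well if you still need it.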
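To see the correlated subquery working together with the original "exactly twice" filter, here is a minimal sketch that runs against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made so the example can run anywhere; the table holds only a subset of the sample rows, the split between Address and Company is my reading of the question's table, and NOT EXISTS is used as an equivalent spelling of the "count is 0" test.

```python
import sqlite3

# In-memory database; SQLite stands in for the asker's real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (CustID INTEGER, StockID INTEGER, Address TEXT, Company TEXT)"
)

# Subset of the sample data from the question; a missing company is stored as NULL.
rows = [
    (1217, 23185, "1 Some Road", "Stockton"),
    (58458, 23185, "1 Some Road", None),
    (58459, 23185, "4 John St", None),
    (58457, 23185, "4 John St", None),
    (299576, 23185, "9 Roadway PDE", "Graceland"),
    (59470, 23185, "9 Roadway PDE", "Cahill Tow"),
    (97656, 23185, "24 Kent St", None),
    (97677, 23185, "24 Kent St", None),
]
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", rows)

query = """
SELECT o1.CustID, o1.Address
FROM Orders AS o1
WHERE o1.Address IN (
        SELECT Address FROM Orders GROUP BY Address HAVING COUNT(*) = 2
      )
  -- Keep the row only if no order at the same address has a company.
  AND NOT EXISTS (
        SELECT 1
        FROM Orders AS o2
        WHERE o2.Address = o1.Address
          AND o2.Company IS NOT NULL
          AND o2.Company <> ''
      )
ORDER BY o1.CustID
"""
result = conn.execute(query).fetchall()
print(result)
```

Checking for both NULL and the empty string covers either way of storing a blank company, so the same query should work whichever convention the real table uses.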

As @ClockWork-Muse mentions in the comments, there are a lot of things you need to consider in this scenario, pertaining to data cleanliness as well as business logic. Having said that, you can try this for your issue:
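;with filtered as
(
select * from Orders
where
address in (select address from Orders group by address having count(*) = 2)
)
,cte as
(select address, max(company) as max_company
from filtered
group by address
having max(company) = '')
select f.* from filtered f
inner join cte c on f.address = c.address
Basically, you create a CTE to identify the addresses whose Company values are all blank, and then join it back to the results of your original query. This assumes a blank company is stored as an empty string; if it is stored as NULL, use having max(isnull(company, '')) = '' instead, because max() over only NULL values returns NULL and the comparison with '' would never be true.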

Related

SQL Combine duplicate rows while concatenating one column

I have a table (example) of orders shown below. The orders come in as multiple rows that are duplicated in every column except the product name. We want to combine the product names into a comma-delimited string, with each name in double quotes. I would like to create a select query that returns the output format shown below.
INPUT
Name         address             city      zip     product name
-----------------------------------------------------------------
John Smith   123 e Test Drive    Phoenix   85045   Eureka Copper Canyon, LX 4-Person Tent
John Smith   123 e Test Drive    Phoenix   85045   The North Face Sequoia 4 Tent with Footprint
Tom Test     567 n desert lane   Tempe     86081   Cannondale Trail 5 Bike - 2021
OUTPUT
Name         address             city      zip     product name
------------------------------------------------------------------
John Smith   123 e Test Drive    Phoenix   85045   "Eureka Copper Canyon, LX 4-Person Tent", "The North Face Sequoia 4 Tent with Footprint"
Tom Test     567 n desert lane   Tempe     86081   Cannondale Trail 5 Bike - 2021

You can use LISTAGG() or GROUP_CONCAT() to build the product list per person and then join the result back to the original table. You can then remove the duplicates with ROW_NUMBER(), which numbers the rows within each group that shares the same name, address, city and zip, and keep only the first row of each group.
WITH ALL_DATA AS (
SELECT * FROM YOUR_TABLE -- PLACEHOLDER: REPLACE WITH YOUR ORDERS TABLE
),
LIST_OF_ITEMS_PER_PRODUCT AS (
SELECT
ALL_DATA.NAME,
LISTAGG(ALL_DATA.PRODUCT_NAME, ',') AS ALL_PRODUCTS_PER_PERSON
-- IF YOUR DATABASE DOESN'T SUPPORT LISTAGG(), USE STRING_AGG() (SQL SERVER, POSTGRESQL)
-- OR GROUP_CONCAT() (MYSQL, SQLITE) INSTEAD
FROM
ALL_DATA
GROUP BY ALL_DATA.NAME
),
LIST_ADDED AS (
SELECT
ALL_DATA.*,
LIST_OF_ITEMS_PER_PRODUCT.ALL_PRODUCTS_PER_PERSON
FROM
ALL_DATA
LEFT JOIN LIST_OF_ITEMS_PER_PRODUCT
ON ALL_DATA.NAME = LIST_OF_ITEMS_PER_PRODUCT.NAME
),
ADDING_ROW_NUMBER AS (
SELECT
* ,
ROW_NUMBER() OVER (PARTITION BY LIST_ADDED.NAME, ADDRESS, CITY, ZIP ORDER BY NAME) AS ROW_NUMBER_
FROM LIST_ADDED
)
SELECT
* FROM
ADDING_ROW_NUMBER
WHERE ROW_NUMBER_ = 1
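If the goal is only one row per person with the quoted product list, a plain GROUP BY with a string aggregate may be enough, without the join back and ROW_NUMBER(). The sketch below is a simplified variant, not the answer's exact method: it assumes SQLite (run through Python's sqlite3 module) and its group_concat() function, and the table name orders and its column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only for illustration (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (name TEXT, address TEXT, city TEXT, zip TEXT, product_name TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        ("John Smith", "123 e Test Drive", "Phoenix", "85045",
         "Eureka Copper Canyon, LX 4-Person Tent"),
        ("John Smith", "123 e Test Drive", "Phoenix", "85045",
         "The North Face Sequoia 4 Tent with Footprint"),
        ("Tom Test", "567 n desert lane", "Tempe", "86081",
         "Cannondale Trail 5 Bike - 2021"),
    ],
)

query = """
SELECT name, address, city, zip,
       -- Wrap each product name in double quotes, then join with ', '.
       group_concat('"' || product_name || '"', ', ') AS products
FROM orders
GROUP BY name, address, city, zip
ORDER BY name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Unlike the expected output in the question, this quotes single products too, which keeps the column format uniform. SQLite does not guarantee the order of the concatenated items; on SQL Server, STRING_AGG(...) WITHIN GROUP (ORDER BY ...) can be used when the order matters.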

SQL to return one row from many

I have the following data
PersonId   City      Type   UpdateDate
123        Boston    P      01/01/2021
123        Boston    M      02/01/2021
130        Detroit   P      01/01/2021
130        Detroit   M      03/01/2021
140        Dallas    M      02/01/2021
140        Dallas    M      03/01/2021
I want a query that returns one row per PersonId. If the Type is "P" return that row otherwise return the row with the minimum UpdateDate. So the query would return:
PersonId   City      Type   UpdateDate
123        Boston    P      01/01/2021
130        Detroit   P      01/01/2021
140        Dallas    M      02/01/2021
In the past I would write a query like
select * from person, address
where person.PersonId = address.PersonId
group by PersonId
having (Type = 'P') or (UpdateDate = min(UpdateDate))
but this is not allowed anymore.
What should my SQL query be in SQL Server?

You want one address per person: the type 'P' row if there is one, otherwise the row with the earliest UpdateDate. outer apply is very well suited to this problem:
select p.*, a.*
from person p outer apply
(select top (1) a.*
from address a
where a.PersonId = p.PersonId
order by (case when a.type = 'P' then 1 else 2 end),
a.updatedate asc
) a;
No aggregation is called for.
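The same "prefer type P, otherwise earliest date" ordering can also be expressed with ROW_NUMBER(), which SQL Server supports as well. The sketch below swaps outer apply for that technique because it runs on SQLite (3.25 or later) through Python's sqlite3 module, which has no outer apply. The single address table and the ISO-formatted dates are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database (assumption); dates stored as ISO text so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address (PersonId INTEGER, City TEXT, Type TEXT, UpdateDate TEXT)"
)
conn.executemany(
    "INSERT INTO address VALUES (?, ?, ?, ?)",
    [
        (123, "Boston", "P", "2021-01-01"),
        (123, "Boston", "M", "2021-02-01"),
        (130, "Detroit", "P", "2021-01-01"),
        (130, "Detroit", "M", "2021-03-01"),
        (140, "Dallas", "M", "2021-02-01"),
        (140, "Dallas", "M", "2021-03-01"),
    ],
)

query = """
WITH ranked AS (
    SELECT PersonId, City, Type, UpdateDate,
           ROW_NUMBER() OVER (
               PARTITION BY PersonId
               -- Type 'P' rows sort first; ties fall back to the earliest date.
               ORDER BY CASE WHEN Type = 'P' THEN 1 ELSE 2 END, UpdateDate
           ) AS rn
    FROM address
)
SELECT PersonId, City, Type, UpdateDate
FROM ranked
WHERE rn = 1
ORDER BY PersonId
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each person gets exactly one row because only the row numbered 1 within each PersonId partition is kept.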

Populating A Many-To-Many Table

I have two tables, the Instructor table and the Department table. An instructor can be involved in many departments, and a department can contain many instructors. I'm trying to populate the DepartmentInstructor table to create a many-to-many relationship. The tables are populated like so,
Department Table
DepartmentID DepartmentName
1 Aaron Copland School of Music
2 American Studies
3 Art
4 Classical, Middle Eastern, and Asian Languages and Cultures
5 Comparative Literature
6 Drama, Theatre & Dance
7 English
8 European Languages and Literatures
Instructor Table
InstructorID InstructorFullName
1 Abrams, Brian
2 Ciavarella, Peter
3 Franklin, Arnold
4 Shur, Mitchell
5 Reich, Toby
6 Meyers, Allison
7 Dana, Kathryn
8 Rhindress, Mindy
What I'm trying to do is,
DepartmentInstructor Table
DepartmentID InstructorID
1 3
3 7
2 7
6 4
Edit:
Responding to @GeorgeJoseph: we were also given a table that contains all of the data besides the IDs. This table is shown below,
Table X
Semester Sec Code Course(HR,CRD) Description Day Time Instructor Location Enrolled Limit ModeOfInstruction
Spring 2019 02 37366 ACCT 100 (3, 3) Fin & Mgr Acct T, TH 3:10 PM - 4:25 PM Milo, Michael KY 419 20 22 In-Person
Spring 2019 03 37823 ACCT 100 (3, 3) Fin & Mgr Acct M 3:10 PM - 6:00 PM Ho, Vivian HH 17 21 22 In-Person
Spring 2019 01 37365 ACCT 100 (3, 3) Fin & Mgr Acct T, TH 10:45 AM - 12:00 PM Milo, Michael KY 419 22 22 In-Person
Spring 2019 06 7351 ACCT 101 (4, 3) Int Theo & Prac Acct 1 T, TH 12:10 PM - 2:00 PM Feisullin, Anita RA 201 30 30 In-Person
Spring 2019 12 7357 ACCT 101 (4, 3) Int Theo & Prac Acct 1 SU 8:20 AM - 12:00 PM Mintz, Chana PH 204 39 55 In-Person
Spring 2019 11 7356 ACCT 101 (4, 3) Int Theo & Prac Acct 1 S 8:20 AM - 12:00 PM Chan, Joseph PH 110 54 55 In-Person
Spring 2019 10 7355 ACCT 101 (4, 3) Int Theo & Prac Acct 1 F 6:30 PM - 10:30 PM Solarsh, Eva PH 212 30 30 Hybrid
Spring 2019 09 7354 ACCT 101 (4, 3) Int Theo & Prac Acct 1 T, TH 8:50 PM - 10:30 PM Zapf, Michael PH 110 29 55 In-Person
I added the data to the Instructor Table and the Department table through this table. Let's call this table X. The DepartmentName was created by using a case statement over the Course(HR,CRD) column.
Now to answer your question, Table X should help us in forming that many-to-many relationship between the Instructor and the Department Table. I'm currently not sure how to map the relationship. What I tried doing was this,
SELECT DISTINCT [Description], Instructor
FROM Schema.X AS x
INNER JOIN [College].[Instructor] AS I
ON x.Instructor = I.InstructorFullName
This will then give me the corresponding course taught by a professor but I'm unsure of how to go from here.
Edit 2:
Here's how my DB design looks,

As George and yourself have mentioned, you are almost there. I am using SQL Server / T-SQL
In my example you have a course table, an instructor table and a department table.
The course table must have the instructorID and the departmentID as a column. This is how you bridge the gap between all the tables. It means that you have a distinct list of departments, courses (with the linking department and tutor IDs) and a distinct list of tutors. There are considerations where a course has more than one tutor (Could happen I suppose) but test out what suits your setup. Probably add a new row to courses with the same departmentId and the 2nd tutorID.
I have added some extra columns in the output.
You can also see that not all departments have courses assigned to them. Lack of funding! Also note that I have used a left join where an inner join might work better, depending on your situation or where clause, e.g. WHERE courseID IS NOT NULL.
http://sqlfiddle.com/#!18/cf48b/1/0

Ok so, through some trial and error and thoroughly reading through the data. I've come to a solution that I believe to be correct,
INSERT INTO [College].[DepartmentInstructor]
(DepartmentInstructorID, DepartmentKey, InstructorKey)
SELECT
NEXT VALUE FOR [Project3].[SequenceObjectForDepartmentInstructorId],
DepartmentID,
InstructorID
FROM (
SELECT DISTINCT InstructorID, DepartmentID
FROM Uploadfile.CoursesSpring2019 AS CS
INNER JOIN [College].[Instructor] AS I
ON CS.Instructor = I.InstructorFullName
INNER JOIN [College].[Department] AS D
ON CS.[Course (hr, crd)] LIKE CONCAT('%', D.DepartmentName, '%')
) AS Result
I've been able to progress further in my project and I'm about 95% done. I've actually stumbled onto a somewhat similar problem. If you refer back to the database design that I posted, the courses table will need the IDs from multiple tables. This is what I've come up with,
SELECT DISTINCT
TS.TimeSlotID,
I.InstructorID,
BL.BuildingLocationID,
C.CourseID
FROM Uploadfile.CoursesSpring2019 AS CS
INNER JOIN [College].[TimeSlot] AS TS
ON CS.[Time] = TS.[ClassHours]
INNER JOIN [College].[Instructor] AS I
ON CS.[Instructor] = I.[InstructorFullName]
INNER JOIN [College].[BuildingLocation] AS BL
ON CS.[Location] LIKE CONCAT( BL.[BuildingName], '%')
INNER JOIN [College].[Course] AS C
ON CS.[Course (hr, crd)] LIKE CONCAT(C.CourseName, '%')
The problem here is that this query returns approximately 1 million rows. Table X has approximately 4,700 rows, which means the result is nowhere near the number of rows I should have.
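One way to investigate a row explosion like this is to test each join on its own and count how many lookup rows every source row matches. The sketch below illustrates the idea on SQLite through Python's sqlite3 module. The tables upload and course, their columns, and the duplicated course name are all hypothetical; they only simulate one possible cause (a prefix LIKE join hitting more than one lookup row), not the actual data.

```python
import sqlite3

# In-memory SQLite database with tiny hypothetical tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE upload (sec TEXT, course TEXT);
    CREATE TABLE course (course_id INTEGER, course_name TEXT);
    INSERT INTO upload VALUES ('01', 'ACCT 100 (3, 3)'), ('06', 'ACCT 101 (4, 3)');
    -- 'ACCT 100' appears twice on purpose to simulate duplicates in a lookup table.
    INSERT INTO course VALUES (1, 'ACCT 100'), (2, 'ACCT 100'), (3, 'ACCT 101');
    """
)

# For one join at a time, list the source rows that match more than one lookup row.
fan_out = conn.execute(
    """
    SELECT u.sec, COUNT(*) AS matches
    FROM upload AS u
    JOIN course AS c ON u.course LIKE c.course_name || '%'
    GROUP BY u.sec
    HAVING COUNT(*) > 1
    ORDER BY u.sec
    """
).fetchall()
print(fan_out)
```

Any source row listed here is multiplied by that join. When several joins each multiply rows, the factors compound, which is how a few thousand source rows can grow into a very large result.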

Include a column to count records with a specific value

I want to return all data in a table and append a column that counts the number of records in a subset (say, the number of houses in a neighborhood).
I tried
CASE
WHEN EXISTS (SELECT 1 as [parcels]
FROM dbo.parcels p2
WHERE p2.Neighborhood = p.Neighborhood)
THEN COUNT([parcels]) END -- can't count outside subquery
as [TotalProps]
The subquery itself returns a value of 1 for each property record in any given neighborhood, but I can't count/sum the [parcels] outside of the subquery in a THEN statement.
Input Table:
dbo.parcels
ID Address Neighborhood
== ======= ============
1 123 Main St MITO
2 124 Main St MITO
3 200 2nd St MITO
4 201 2nd St MITO
5 5 Park Ave FAIRWIND
6 1600 Baker St GALLERY
7 1601 Baker St GALLERY
8 1602 Baker St GALLERY
SELECT *, <<<COUNT(neighborhood props)>>> as [TotalProps]
FROM dbo.parcels p
Expected Output:
ID Address Neighborhood TotalProps
== ======= ============ ==========
1 123 Main St MITO 4
2 124 Main St MITO 4
3 200 2nd St MITO 4
4 201 2nd St MITO 4
5 5 Park Ave FAIRWIND 1
6 1600 Baker St GALLERY 3
7 1601 Baker St GALLERY 3
8 1602 Baker St GALLERY 3

You can use COUNT(...) OVER (PARTITION BY ...) as a window aggregate:
SELECT
p.*,
COUNT(ID) OVER(PARTITION BY Neighborhood) AS TotalProps
FROM dbo.parcels p

Use window functions:
select p.*, count(*) over (partition by neighborhood) as TotalProps
from dbo.parcels p;

Keeping things simple - a basic subselect will give you what you need ...
SELECT
p.*,
(
select count(*)
FROM dbo.parcels p2
WHERE p2.neighborhood = p.neighborhood ) AS hoodcount
FROM dbo.parcels p
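The window-function and correlated-subquery answers above should return the same counts. The following sketch checks that on the sample data, assuming SQLite 3.25 or later through Python's sqlite3 module (the dbo schema prefix is dropped because SQLite has no schemas of that kind).

```python
import sqlite3

# In-memory SQLite database holding the sample parcels (SQLite is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (ID INTEGER, Address TEXT, Neighborhood TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [
        (1, "123 Main St", "MITO"),
        (2, "124 Main St", "MITO"),
        (3, "200 2nd St", "MITO"),
        (4, "201 2nd St", "MITO"),
        (5, "5 Park Ave", "FAIRWIND"),
        (6, "1600 Baker St", "GALLERY"),
        (7, "1601 Baker St", "GALLERY"),
        (8, "1602 Baker St", "GALLERY"),
    ],
)

# Window aggregate: counts rows per neighborhood without collapsing them.
window_sql = """
SELECT p.ID, p.Neighborhood,
       COUNT(*) OVER (PARTITION BY p.Neighborhood) AS TotalProps
FROM parcels AS p
ORDER BY p.ID
"""

# Correlated subquery: recounts the neighborhood for every outer row.
subquery_sql = """
SELECT p.ID, p.Neighborhood,
       (SELECT COUNT(*) FROM parcels AS p2
        WHERE p2.Neighborhood = p.Neighborhood) AS TotalProps
FROM parcels AS p
ORDER BY p.ID
"""

window_rows = conn.execute(window_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()
print(window_rows == subquery_rows)
for row in window_rows:
    print(row)
```

Both forms keep every input row. The window version reads the table once in principle, while the subquery version is the fallback when window functions are unavailable.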

query to find more than one name with different values

This is my table. I need a count column showing how many rows each name has. I used COUNT in my query, but the name Timur appears with two different companies, so those rows should show a count of 2 rather than 1.
Name     ID   Company Name      CompanyID   Role Name
Ahmed    73   King & Spalding   55          Counsel
Timur    78   Chance CIS Ltd    39          Partner
Timur    78   Clifford LLP      28          Counsel
Rahail   80   Reed Smith ltd    97          Partner
The output should look like this:
Name     ID   Company Name      CompanyID   Role Name   count
Ahmed    73   King & Spalding   55          Counsel     1
Timur    78   Chance CIS Ltd    39          Partner     2
Timur    78   Clifford LLP      28          Counsel     2
Rahail   80   Reed Smith ltd    97          Partner     1

I am assuming that name and ID match each other. So in case of duplicated names for different people, I am using ID for partitioning
SELECT
*,
count(*) over (partition by ID) as [count]
FROM yourtable

Use correlated sub-query:
select t.*, (select count(*) from tablename where name = t.name) as count
from tablename t

If you're using SQL Server 2005 or above then you can use a window function to achieve this easily:
SELECT
T.Name,
T.ID,
T.CompanyName,
T.CompanyID,
T.RoleName,
COUNT(*) OVER (PARTITION BY T.Name) AS [count]
FROM
My_Table T
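The answers above partition either by ID or by Name, and the two only differ when different people share a name. The sketch below shows both counts side by side, assuming SQLite 3.25 or later through Python's sqlite3 module. The table name staff, the column names, and the last row (a second, unrelated Ahmed with ID 99) are hypothetical and added only to make the difference visible.

```python
import sqlite3

# In-memory SQLite database (assumption); staff is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (Name TEXT, ID INTEGER, CompanyName TEXT, "
    "CompanyID INTEGER, RoleName TEXT)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?, ?)",
    [
        ("Ahmed", 73, "King & Spalding", 55, "Counsel"),
        ("Timur", 78, "Chance CIS Ltd", 39, "Partner"),
        ("Timur", 78, "Clifford LLP", 28, "Counsel"),
        ("Rahail", 80, "Reed Smith ltd", 97, "Partner"),
        # Hypothetical extra row: a different person who happens to share a name.
        ("Ahmed", 99, "Hypothetical Co", 11, "Associate"),
    ],
)

query = """
SELECT Name, ID,
       COUNT(*) OVER (PARTITION BY ID)   AS count_by_id,
       COUNT(*) OVER (PARTITION BY Name) AS count_by_name
FROM staff
ORDER BY ID, CompanyID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Partitioning by ID keeps the two Ahmeds apart, while partitioning by Name counts them together, so the right choice depends on whether names are unique in the real data.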
