SQL Join Multiple Tables with One to Many Relationships without "Duplication"


First let me start by saying that I do understand that these are not duplicate rows. I understand the basic functionality of joining multiple tables. I am just trying to find out if there is a way to do what I am trying to do in SQL and I don't know a better way to title it.
Example Tables:
Day Table
Day_KEY Day_Label
1 Mon
2 Tues
3 Wed
4 Thur
EstHours Table
EstHours_KEY Day_KEY Est_Hours
1 1 2
2 1 1
3 1 3
ActHours Table
ActHours_KEY Day_KEY Act_Hours
1 1 3
2 1 2
3 1 2
Example Query:
select *
from Day
join EstHours on EstHours.Day_KEY = Day.Day_KEY
join ActHours on ActHours.Day_KEY = Day.Day_KEY
Result:
Day_KEY Day_Label EstHours_KEY Day_KEY Est_Hours ActHours_KEY Day_KEY Act_Hours
1 Mon 1 1 2 1 1 3
1 Mon 1 1 2 2 1 2
1 Mon 1 1 2 3 1 2
1 Mon 2 1 1 1 1 3
1 Mon 2 1 1 2 1 2
1 Mon 2 1 1 3 1 2
1 Mon 3 1 3 1 1 3
1 Mon 3 1 3 2 1 2
1 Mon 3 1 3 3 1 2
Desired Result:
Day_KEY Day_Label EstHours_KEY Day_KEY Est_Hours ActHours_KEY Day_KEY Act_Hours
1 Mon 1 1 2 1 1 3
1 Mon 2 1 1 2 1 2
1 Mon 3 1 3 3 1 2
What I have tried:
1)
Query:
select *
from (
select Day.Day_KEY, Day.Day_Label, EstHours.EstHours_KEY, EstHours.Est_Hours,
row_number() over (partition by Day.Day_KEY order by EstHours.EstHours_KEY) as rn
from Day
join EstHours on EstHours.Day_KEY = Day.Day_KEY) est
join (
select ActHours_KEY, Day_KEY, Act_Hours,
row_number() over (partition by Day_KEY order by ActHours_KEY) as rn
from ActHours) act on act.Day_KEY = est.Day_KEY and act.rn = est.rn
Result:
Day_KEY Day_Label EstHours_KEY Day_KEY Est_Hours ActHours_KEY Day_KEY Act_Hours
1 Mon 1 1 2 1 1 3
1 Mon 2 1 1 2 1 2
1 Mon 3 1 3 3 1 2
This does what I need unless EstHours has fewer rows than ActHours, in which case the extra ActHours rows are left out.
2)
Query:
select *, null, null, null
from Day
join EstHours on EstHours.Day_KEY = Day.Day_KEY
union
select Day.*, null, null, null, ActHours.*
from Day
join ActHours on ActHours.Day_KEY = Day.Day_KEY
Result:
Day_KEY Day_Label EstHours_KEY Day_KEY Est_Hours ActHours_KEY Day_KEY Act_Hours
1 Mon 1 1 2 null null null
1 Mon 2 1 1 null null null
1 Mon 3 1 3 null null null
1 Mon null null null 1 1 3
1 Mon null null null 2 1 2
1 Mon null null null 3 1 2
This does what I want, except that I would prefer the values to be on the same rows, so that the maximum number of rows for a single Day_KEY would be that of either EstHours or ActHours, whichever has more.
Has anyone any idea of how this can be done? Am I going about this all wrong?

Sounds like you need a GROUP BY clause on a unique/distinct field belonging to the 'one' table in the one-to-many relationship, such as a row id.
select * from table_a,table_b,table_c group by table_a.rowid
This will collapse the results to distinct rows from table_a, and also allow the select result to use/include aggregate functions like sum() on the fields from table_b or table_c.
In the example I used, think of every row from table_b and table_c overlapping with the unique rows of table_a that get returned.
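For the side-by-side layout the question asks for, one possible approach (a sketch, not the answerer's method) is to number the rows of each child table per Day_KEY and pair them on that number, keeping every row number that exists on either side. On SQL Server the pairing could be written as a FULL OUTER JOIN on (Day_KEY, rn); the sketch below uses Python's sqlite3 as a stand-in engine and builds the set of row numbers with a UNION instead. The fourth ActHours row and the CTE names est, act and slots are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Day (Day_KEY INTEGER PRIMARY KEY, Day_Label TEXT);
CREATE TABLE EstHours (EstHours_KEY INTEGER PRIMARY KEY, Day_KEY INTEGER, Est_Hours INTEGER);
CREATE TABLE ActHours (ActHours_KEY INTEGER PRIMARY KEY, Day_KEY INTEGER, Act_Hours INTEGER);
INSERT INTO Day VALUES (1, 'Mon'), (2, 'Tues'), (3, 'Wed'), (4, 'Thur');
INSERT INTO EstHours VALUES (1, 1, 2), (2, 1, 1), (3, 1, 3);
-- The fourth ActHours row is hypothetical, added to exercise the uneven case.
INSERT INTO ActHours VALUES (1, 1, 3), (2, 1, 2), (3, 1, 2), (4, 1, 5);
""")

PAIRED_SQL = """
WITH est AS (
    -- Number the estimate rows within each day.
    SELECT EstHours_KEY, Day_KEY, Est_Hours,
           ROW_NUMBER() OVER (PARTITION BY Day_KEY ORDER BY EstHours_KEY) AS rn
    FROM EstHours
),
act AS (
    -- Number the actual rows within each day.
    SELECT ActHours_KEY, Day_KEY, Act_Hours,
           ROW_NUMBER() OVER (PARTITION BY Day_KEY ORDER BY ActHours_KEY) AS rn
    FROM ActHours
),
slots AS (
    -- Every (day, row number) that exists on either side.
    SELECT Day_KEY, rn FROM est
    UNION
    SELECT Day_KEY, rn FROM act
)
SELECT d.Day_KEY, d.Day_Label,
       e.EstHours_KEY, e.Est_Hours,
       a.ActHours_KEY, a.Act_Hours
FROM Day AS d
JOIN slots AS s ON s.Day_KEY = d.Day_KEY
LEFT JOIN est AS e ON e.Day_KEY = s.Day_KEY AND e.rn = s.rn
LEFT JOIN act AS a ON a.Day_KEY = s.Day_KEY AND a.rn = s.rn
ORDER BY d.Day_KEY, s.rn
"""

rows = conn.execute(PAIRED_SQL).fetchall()
for row in rows:
    print(row)
```

With this data the first three rows pair up as in the desired result, and the unmatched ActHours row appears once with NULLs in the EstHours columns, so the row count per day is that of the larger child table.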

Related

Join 3 tables in bigquery with no duplication

I want to join 3 tables:
table 1
c_id gateway_id timestamp
1 0 2019-01-05 06:53:24 UTC
2 0 2019-01-05 08:51:24 UTC
table 2
gateway_id gateway_name
0 a
1 b
table 3
date u_id
2022-08-13 1
2022-08-13 2
I join the 3 tables: tables 1 and 2 on gateway_id, and then that result to table 3 on c_id = u_id. Here's the query I tried:
WITH
date_dict AS (
SELECT
DATE('2022-08-04') AS start_dt,
DATE_SUB(current_date, INTERVAL 1 day) AS end_dt),
mp AS(
SELECT
a.c_id,
a.gateway_id,
b.gateway_name,
SUM(a.actual_amount) total_p
FROM a
JOIN b USING(gateway_id)
WHERE
date(a.update_timestamp) between (select start_dt from date_dict) and (select end_dt from date_dict)
AND b.gateway_source_name = 'Marketplace'
GROUP BY 1,2,3)
SELECT m.c_id, p.u_id, m.gateway_id, m.gateway_name, date, m.total_p
FROM mp m
JOIN p ON m.c_id = CAST(p.u_id AS INT64)
But there is duplication in the result, like this:
c_id u_id gateway_id gateway_name date_key total_p
1 1 0 a 2022-08-18 800000
1 1 0 a 2022-08-18 800000
1 1 1 b 2022-08-18 634490
1 1 1 b 2022-08-18 634490
1 1 2 c 2022-08-18 200000
1 1 2 c 2022-08-18 200000
When I add GROUP BY to the last query, I get an error. I want the result to look like this:
c_id u_id gateway_id gateway_name date_key total_p
1 1 0 a 2022-08-18 800000
1 1 1 b 2022-08-18 634490
1 1 2 c 2022-08-18 200000
There is no duplication. Any suggestions?

Shift a column based on id and time sql server

I have a big table like below:
id date count
1 201241 1
2 201241 2
3 201241 0
1 201242 5
2 201242 3
4 201242 4
3 201243 8
4 201243 2
...
How can I shift the count column based on the id and date columns?
id date shifted_count
1 201241 0
2 201241 0
3 201241 0
1 201242 1
2 201242 2
4 201242 0
3 201243 0
4 201243 4
...
I have made some attempts, but they are incorrect:
;WITH CTE AS
(
SELECT count OVER(ORDER BY id , date ASC) shcount
FROM mytable
)
UPDATE mytable SET shifted_count = (SELECT shcount from CTE )
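A possible reading of the desired output is that each row should receive the count of the previous row for the same id, ordered by date, with 0 when there is no previous row. Under that assumption the LAG window function fits (it is available in SQL Server 2012 and later). The sketch below runs the query through Python's sqlite3 as a stand-in engine, using the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (id INTEGER, date INTEGER, "count" INTEGER)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 201241, 1), (2, 201241, 2), (3, 201241, 0),
        (1, 201242, 5), (2, 201242, 3), (4, 201242, 4),
        (3, 201243, 8), (4, 201243, 2),
    ],
)

# LAG(value, 1, 0) returns the value from the previous row of the same
# partition (here: same id, ordered by date) and 0 when there is none.
shifted = conn.execute("""
    SELECT id, date,
           LAG("count", 1, 0) OVER (PARTITION BY id ORDER BY date) AS shifted_count
    FROM mytable
    ORDER BY date, id
""").fetchall()

for row in shifted:
    print(row)
```

The rows come back in the same order and with the same shifted_count values as the desired table in the question.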

How to use Dense Rank and automatically generate dates

I have two questions: one regarding DENSE_RANK and the other about inserting dates. Basically I have 2 leagues with 4 teams per league. Each league has a round of fixtures like so:
League 1
Week1: 1v4, 2v3 - Date: 10-June-2016
Week2: 1v3, 2v4 - Date: 17-June-2016
Week3: 1v2, 3v4 - Date: 24-June-2016
League 2
Week1: 5v8, 6v7 - Date: 10-June-2016
Week2: 5v7, 6v8 - Date: 17-June-2016
Week3: 5v6, 7v8 - Date: 24-June-2016
(They play each other home and away)
League 1 and League 2 correspond to LeagueID 1 and LeagueID 2.
The week numbers are stored in the WeekNumber column.
Teams 1-8 have their own IDs (TeamID, which is then stored as HomeTeamID and AwayTeamID).
Date goes into a column which is FixtureDate
My questions are:
1- How can I set WeekNumber so that each group of games listed above is recognized as belonging to week 1, week 2, week 3, etc.?
2- How can I auto-generate the date so that if week 1 is played on 10 June 2016, the next round of fixtures is played 7 days later, the round after that another 7 days later, and so on?
Below is what the table looks like currently:
WeekNumber HomeTeamID AwayTeamID FixtureWeek LeagueID
1 1 4 NULL 1
1 1 3 NULL 1
1 1 2 NULL 1
1 2 3 NULL 1
1 2 4 NULL 1
1 3 4 NULL 1
1 5 8 NULL 2
1 5 7 NULL 2
1 5 6 NULL 2
1 6 7 NULL 2
1 6 8 NULL 2
1 7 8 NULL 2
Below is what it should look like:
WeekNumber HomeTeamID AwayTeamID FixtureWeek LeagueID
1 1 4 10-06-2016 1
2 1 3 17-06-2016 1
3 1 2 24-06-2016 1
1 2 3 10-06-2016 1
2 2 4 17-06-2016 1
3 3 4 24-06-2016 1
1 5 8 10-06-2016 2
2 5 7 17-06-2016 2
3 5 6 24-06-2016 2
1 6 7 10-06-2016 2
2 6 8 17-06-2016 2
3 7 8 24-06-2016 2
Below is my current code, which needs to be modified; this is what I need help with:
CREATE PROCEDURE [dbo].[Fixture_Insert]
@LeagueID INT
AS
SET NOCOUNT ON
BEGIN
INSERT INTO dbo.Fixture (WeekNumber, HomeTeamID, AwayTeamID, FixtureWeek, LeagueID)
SELECT
ROW_NUMBER() OVER (ORDER BY a.LeagueID) AS WeekNumber,
h.TeamID,
a.TeamID,
NULL AS FixtureWeek, -- Don't know what to set this to for automatic dates
h.LeagueID
FROM dbo.Team h
CROSS JOIN dbo.Team a
WHERE h.TeamID <> a.TeamID
AND h.LeagueID = a.LeagueID
END
UPDATE:
I've added images to show what is happening so you can see what needs to be fixed (the table displayed is the result of select * from dbo.Fixture):
The proc I executed for the above is displayed here:

DECLARE @StartFixtureWeek date = '2016-06-10'
;WITH team AS (
SELECT *
FROM (VALUES
(1,1),(2,1),(3,1),(4,1),(5,2),(6,2),(7,2),(8,2)
) as t (teamid, leagueid)
)
, cte AS (
SELECT h.teamid AS HomeTeamID,
a.teamid AS AwayTeamID,
h.leagueid AS LeagueID
FROM team h
CROSS JOIN team a
WHERE h.teamid != a.teamid AND h.leagueid = a.leagueid
), final AS (
SELECT ROW_NUMBER() OVER (PARTITION BY c.LeagueID ORDER BY c.LeagueID, c.HomeTeamID, c.AwayTeamID) as rn,
c.HomeTeamID,
c.AwayTeamID,
c.LeagueID
FROM cte c
CROSS APPLY (
SELECT TOP 1 a.HomeTeamID, a.AwayTeamID
FROM cte a
WHERE a.LeagueID= c.LeagueID and a.AwayTeamID=c.HomeTeamID and a.HomeTeamID =c.AwayTeamID
ORDER BY a.HomeTeamID, a.LeagueID) as b
WHERE c.HomeTeamID < b.HomeTeamID
)
SELECT CASE WHEN rn > 3 THEN rn-3 ELSE rn END as WeekNumber,
HomeTeamID,
AwayTeamID,
CAST(DATEADD(week,(CASE WHEN rn > 3 THEN rn-3 ELSE rn END)-1,@StartFixtureWeek) as date) FixtureWeek,
LeagueID
FROM final
Output:
WeekNumber HomeTeamID AwayTeamID FixtureWeek LeagueID
-------------------- ----------- ----------- ----------- -----------
1 1 2 2016-06-10 1
2 1 3 2016-06-17 1
3 1 4 2016-06-24 1
1 2 3 2016-06-10 1
2 2 4 2016-06-17 1
3 3 4 2016-06-24 1
1 5 6 2016-06-10 2
2 5 7 2016-06-17 2
3 5 8 2016-06-24 2
1 6 7 2016-06-10 2
2 6 8 2016-06-17 2
3 7 8 2016-06-24 2
(12 row(s) affected)

Add a parameter to your stored procedure called @StartFixtureWeek DATETIME.
Then you can use DATEADD:
CREATE PROCEDURE [dbo].[Fixture_Insert]
@LeagueID INT,
@StartFixtureWeek DATETIME
AS
SET NOCOUNT ON
BEGIN
INSERT INTO dbo.Fixture (WeekNumber, HomeTeamID, AwayTeamID, FixtureWeek, LeagueID)
SELECT
ROW_NUMBER() OVER (ORDER BY a.LeagueID) AS WeekNumber,
h.TeamID,
a.TeamID,
DATEADD(day,(ROW_NUMBER() OVER (ORDER BY a.LeagueID)-1)*7,@StartFixtureWeek) AS FixtureWeek,
h.LeagueID
FROM dbo.Team h
CROSS JOIN dbo.Team a
WHERE h.TeamID <> a.TeamID
AND h.LeagueID = a.LeagueID
END
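The date arithmetic behind DATEADD can be checked in isolation: week N falls (N - 1) * 7 days after the start date. The sketch below is only an illustration of that formula, run through Python's sqlite3, where the date() function with a '+N days' modifier plays the role of DATEADD. The six-row pairs list, and the assumption that week numbers wrap around after 3 weeks, are hypothetical and mirror the 4-team leagues in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
start_fixture_week = "2016-06-10"  # start date taken from the question

# rn stands in for the row number of each fixture within one league.
# ((rn - 1) % 3) + 1 wraps the week number after 3 weeks (assumption),
# and the same offset times 7 gives the number of days to add.
fixtures = conn.execute("""
    WITH pairs(rn) AS (VALUES (1), (2), (3), (4), (5), (6))
    SELECT rn,
           ((rn - 1) % 3) + 1 AS week_number,
           date(?, '+' || (((rn - 1) % 3) * 7) || ' days') AS fixture_week
    FROM pairs
    ORDER BY rn
""", (start_fixture_week,)).fetchall()

for row in fixtures:
    print(row)
```

Each week number maps to a date 7 days after the previous one, starting from 2016-06-10, which matches the FixtureWeek values in the desired table.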

SQL Server: Join 2 tables, preferring results from one table where there is a conflict

I have tables that look like this:
tblConsuptionsFromA
id meter date total
1 1 03/01/2014 100.1
2 1 04/01/2014 184.1
3 1 05/01/2014 134.1
4 1 06/01/2014 132.4
5 1 07/01/2014 126.1
6 1 08/01/2014 190.1
and...
tblConsuptionsFromB
id meter date total
1 1 01/01/2014 164.1
2 1 02/01/2014 133.1
3 1 03/01/2014 136.1
4 1 04/01/2014 125.1
5 1 05/01/2014 190.1
6 1 06/01/2014 103.1
7 1 07/01/2014 164.1
8 1 08/01/2014 133.1
9 1 09/01/2014 136.1
10 1 10/01/2014 125.1
11 1 11/01/2014 190.1
I need to join these two tables, but if there is an entry for the same day in both tables, only take the result from tblConsuptionsFromA.
So the result would be:-
id source_id meter from date total
1 1 1 B 01/01/2014 164.1
2 2 1 B 02/01/2014 133.1
3 1 1 A 03/01/2014 100.1
4 2 1 A 04/01/2014 184.1
5 3 1 A 05/01/2014 134.1
6 4 1 A 06/01/2014 132.4
7 5 1 A 07/01/2014 126.1
8 6 1 A 08/01/2014 190.1
9 9 1 B 09/01/2014 136.1
10 10 1 B 10/01/2014 125.1
11 11 1 B 11/01/2014 190.1
This is beyond me, so if someone can solve... I will be very impressed.

Here's one way to do it:
SELECT
COALESCE(a.source_id,b.source_id) as source_id,
COALESCE(a.meter,b.meter) as meter,
COALESCE(a.[from],b.[from]) as [from],
COALESCE(a.[date],b.[date]) as [date],
COALESCE(a.total,b.total) as total
FROM (select id as source_id,meter,'B' as [from],[date],total
from tblConsuptionsFromB) b
left join
(select id as source_id,meter,'A' as [from],[date],total
from tblConsuptionsFromA) a
on
a.meter = b.meter and
a.[date] = b.[date]
Unfortunately, there's no shorthand like COALESCE(a.*,b.*) to apply the COALESCE to all columns

The UNION operator is used to combine the result-set of two or more SELECT statements.
SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2;
The documentation for UNION is here:
http://www.w3schools.com/sql/sql_union.asp
And ROW_NUMBER() returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
ROW_NUMBER ( )
OVER ( [ PARTITION BY value_expression , ... [ n ] ] order_by_clause )
The documentation for ROW_NUMBER() is here:
http://technet.microsoft.com/en-us/library/ms186734.aspx
The following SQL statement uses UNION to select all records from "tblConsuptionsFromA" and only those records from "tblConsuptionsFromB" whose date does not appear in "tblConsuptionsFromA".
SELECT ROW_NUMBER() OVER(ORDER BY DATE ASC) AS 'id',
id AS 'source_id',meter, date,t AS 'from',total
FROM(
SELECT id,meter, date, 'A' AS t, total FROM tblConsuptionsFromA
UNION
SELECT id,meter, date, 'B' AS t,total FROM tblConsuptionsFromB
WHERE NOT date IN (SELECT date FROM tblConsuptionsFromA)
) AS C;
Hope this helps.

select ta.id, tb.id, coalesce(ta.meter, tb.meter) as meter,
case when ta.date is null then 'B' else 'A' end as [from],
coalesce(ta.date, tb.date) as [date],
case when ta.date is null then tb.total else ta.total end as total
from tblConsuptionsFromA ta
full join tblConsuptionsFromB tb on ta.date=tb.date

You would need to do a union of the 2 tables and exclude the records from table tblConsuptionsFromB whose dates are present in tblConsuptionsFromA, something like:
Select Id AS Source_ID, meter, 'A' AS [From], Date, Total
FROM tblConsuptionsFromA
Union All
Select Id AS Source_ID, meter, 'B' AS [From], Date, Total
FROM tblConsuptionsFromB
Where Date NOT IN (Select Date from tblConsuptionsFromA)
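The union-with-exclusion idea from the answers above can be checked against the sample data. The sketch below is one possible formulation, run through Python's sqlite3 as a stand-in for SQL Server: it uses NOT EXISTS matched on both meter and date, and ROW_NUMBER() to produce the new id. The column alias src (in place of the reserved word from) is an assumption, and the dates are stored as text exactly as shown in the question, which only sorts correctly here because every sample value shares the same month and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblConsuptionsFromA (id INTEGER, meter INTEGER, date TEXT, total REAL);
CREATE TABLE tblConsuptionsFromB (id INTEGER, meter INTEGER, date TEXT, total REAL);
INSERT INTO tblConsuptionsFromA VALUES
  (1, 1, '03/01/2014', 100.1), (2, 1, '04/01/2014', 184.1),
  (3, 1, '05/01/2014', 134.1), (4, 1, '06/01/2014', 132.4),
  (5, 1, '07/01/2014', 126.1), (6, 1, '08/01/2014', 190.1);
INSERT INTO tblConsuptionsFromB VALUES
  (1, 1, '01/01/2014', 164.1), (2, 1, '02/01/2014', 133.1),
  (3, 1, '03/01/2014', 136.1), (4, 1, '04/01/2014', 125.1),
  (5, 1, '05/01/2014', 190.1), (6, 1, '06/01/2014', 103.1),
  (7, 1, '07/01/2014', 164.1), (8, 1, '08/01/2014', 133.1),
  (9, 1, '09/01/2014', 136.1), (10, 1, '10/01/2014', 125.1),
  (11, 1, '11/01/2014', 190.1);
""")

merged = conn.execute("""
    SELECT ROW_NUMBER() OVER (ORDER BY date) AS id,
           source_id, meter, src, date, total
    FROM (
        -- Every row from A is kept.
        SELECT id AS source_id, meter, 'A' AS src, date, total
        FROM tblConsuptionsFromA
        UNION ALL
        -- Rows from B are kept only when A has no row for the same meter and date.
        SELECT b.id, b.meter, 'B', b.date, b.total
        FROM tblConsuptionsFromB AS b
        WHERE NOT EXISTS (
            SELECT 1 FROM tblConsuptionsFromA AS a
            WHERE a.meter = b.meter AND a.date = b.date
        )
    ) AS c
    ORDER BY date
""").fetchall()

for row in merged:
    print(row)
```

NOT EXISTS is used here instead of NOT IN because NOT IN returns no rows at all if the subquery yields a NULL date; with the sample data both forms give the same 11 rows.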

T-SQL Reverse Pivot on every character of a string

We have a table like the one below in a SQL Server 2005 database:
event_id staff_id weeks
1 1 NNNYYYYNNYYY
1 2 YYYNNNYYYNNN
2 1 YYYYYYYYNYYY
This is from a piece of timetabling software and is basically saying which staff members are assigned to an event (register) and the set of weeks they are teaching that register. So staff_id 1 isn't teaching the first 3 weeks of event 1 but is teaching the following 4....
Is there an easy way to convert that to an easier form such as:
event_id staff_id week
1 1 4
1 1 5
1 1 6
1 1 7
1 1 10
1 1 11
1 1 12
1 2 1
1 2 2
1 2 3
1 2 7
1 2 8
1 2 9
2 1 1
2 1 2
2 1 3
2 1 4
2 1 5
2 1 6
2 1 7
2 1 8
2 1 10
2 1 11
2 1 12

WITH cte AS
(
SELECT 1 AS [week]
UNION ALL
SELECT [week] + 1
FROM cte
WHERE [week] < 53
)
SELECT t.event_id, t.staff_id, cte.[week]
FROM your_table AS t
INNER JOIN cte
ON LEN(ISNULL(t.weeks, '')) >= cte.[week]
AND SUBSTRING(t.weeks, cte.[week], 1) = 'Y'
ORDER BY t.event_id, t.staff_id, cte.[week]
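The recursive-CTE approach above can be tried against the sample rows from the question. The sketch below is a translation to SQLite syntax (WITH RECURSIVE, length and substr in place of LEN and SUBSTRING), run through Python's sqlite3; the CTE name numbers and its column n are assumptions, while your_table and its columns follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (event_id INTEGER, staff_id INTEGER, weeks TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, 1, "NNNYYYYNNYYY"),
        (1, 2, "YYYNNNYYYNNN"),
        (2, 1, "YYYYYYYYNYYY"),
    ],
)

# numbers generates 1..53; joining on the character at position n
# yields one row per week whose flag is 'Y'.
unpivoted = conn.execute("""
    WITH RECURSIVE numbers(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM numbers WHERE n < 53
    )
    SELECT t.event_id, t.staff_id, numbers.n AS week
    FROM your_table AS t
    JOIN numbers
      ON numbers.n <= length(t.weeks)
     AND substr(t.weeks, numbers.n, 1) = 'Y'
    ORDER BY t.event_id, t.staff_id, numbers.n
""").fetchall()

for row in unpivoted:
    print(row)
```

For event 1 and staff 1 this produces weeks 4, 5, 6, 7, 10, 11 and 12, as in the desired output of the question.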
