How to include non-matching rows? - sql

This script is working as intended.
select a.Loc, Count(a.PID) as TotalVisit
from AccountCount as a
inner join Data as b
on a.PID = b.PID
where
cast(a.DateTime as date) between cast(b.ADateTime as date) and cast(b.DDateTime as date)
and year(a.DateTime)=2015
and month(a.DateTime)=05
group by a.Loc
order by a.Loc;
However, I need to include a few more PIDs from the Data table. These PIDs are not in the AccountCount table.
select LocID, PID
from Data
where
cast(ADateTime as date) = cast(DDateTime as date)
and year(ADateTime) = 2015
and month(ADateTime)=05
order by LocID;
In simple terms, I need a union of the first script and the second script. I tried a right join on the Data table, but it didn't work.
Using the UNION ALL provided by xQbert, I get a result like this:
Loc TotalVisit
1st floor 20
2nd floor 5
3rd floor 8
1st floor 2
It needs to be
Loc TotalVisit
1st floor 22
2nd floor 5
3rd floor 8
Please help.
Thank you.

I would think a right join would work, as long as the ON criteria are set up correctly and the conditions in the WHERE clause are moved into the join. If they stay in the WHERE clause, the right join effectively becomes an inner join again: the outer join produces rows with NULLs for the unmatched side, and the WHERE clause then excludes those rows, negating the outer join.
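To make the WHERE-versus-ON point concrete, here is a minimal sketch. It uses SQLite through Python's sqlite3 module rather than the asker's database, and the tables data_tbl and account_count, their columns, and their rows are all hypothetical. It uses a LEFT JOIN, but the same reasoning applies to a RIGHT JOIN with the table order reversed.

```python
import sqlite3

# Hypothetical miniature tables; names, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_tbl (pid INTEGER, loc_id TEXT);
    CREATE TABLE account_count (pid INTEGER, loc TEXT, visit_year INTEGER);
    INSERT INTO data_tbl VALUES (1, '1st floor'), (2, '2nd floor'), (3, '3rd floor');
    INSERT INTO account_count VALUES (1, '1st floor', 2015), (2, '2nd floor', 2014);
""")

# Filter in WHERE: rows whose account_count side is NULL (or fails the test)
# are discarded after the join, so the outer join behaves like an inner join.
where_rows = conn.execute("""
    SELECT d.pid, a.loc
    FROM data_tbl AS d
    LEFT JOIN account_count AS a ON a.pid = d.pid
    WHERE a.visit_year = 2015
    ORDER BY d.pid
""").fetchall()

# Same filter in ON: it only decides which account_count rows match,
# so every data_tbl row survives, padded with NULL where nothing matched.
on_rows = conn.execute("""
    SELECT d.pid, a.loc
    FROM data_tbl AS d
    LEFT JOIN account_count AS a ON a.pid = d.pid AND a.visit_year = 2015
    ORDER BY d.pid
""").fetchall()

print(where_rows)  # [(1, '1st floor')]
print(on_rows)     # [(1, '1st floor'), (2, None), (3, None)]
```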
The UNION ALL on its own doesn't aggregate across the two result sets. To me, the outer join is the right thing to do here; we just need to understand the data better to make it work correctly. However, with UNION ALL you could simply sum up the results in an outer query. (Now that you've given some sample data, I might be able to figure out why the outer join wasn't working.)
Using UNION ALL (I'm about getting it working first, then improving it):
Select X.Loc, sum(X.TotalVisit) as TotalVisit
from (SELECT a.Loc as LOC, Count(a.PID) as TotalVisit
from AccountCount as a
inner join Data as b
on a.PID = b.PID
where
cast(a.DateTime as date) between cast(b.ADateTime as date) and cast(b.DDateTime as date)
group by a.Loc
UNION ALL
select LocID as LOC, count(PID)
from Data
where
cast(ADateTime as date) = cast(DDateTime as date)
GROUP BY LocID
) X
GROUP BY X.Loc
ORDER BY X.LOC
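As a quick check of the outer SUM idea, the sketch below feeds the sample counts from the question into two hypothetical pre-aggregated tables (matched and same_day, standing in for the two halves of the UNION ALL) and sums them per location. It runs on SQLite via Python's sqlite3, which is an assumption made only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two sub-results of the UNION ALL.
    CREATE TABLE matched (loc TEXT, total_visit INTEGER);
    CREATE TABLE same_day (loc TEXT, total_visit INTEGER);
    INSERT INTO matched VALUES ('1st floor', 20), ('2nd floor', 5), ('3rd floor', 8);
    INSERT INTO same_day VALUES ('1st floor', 2);
""")

# UNION ALL stacks the rows; the outer GROUP BY/SUM merges duplicate locations.
totals = conn.execute("""
    SELECT x.loc, SUM(x.total_visit) AS total_visit
    FROM (
        SELECT loc, total_visit FROM matched
        UNION ALL
        SELECT loc, total_visit FROM same_day
    ) AS x
    GROUP BY x.loc
    ORDER BY x.loc
""").fetchall()

print(totals)  # [('1st floor', 22), ('2nd floor', 5), ('3rd floor', 8)]
```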
This leads me to the following, which I think would work: take the first non-null value of location from AccountCount.Loc and Data.LocID and use it. Notice there is no WHERE clause.
SELECT Coalesce(A.Loc, B.LocID) as Loc, count(B.PID) as TotalVisit
FROM Data B
LEFT JOIN AccountCount A
on B.PID = A.PID
and (cast(a.DateTime as date) between cast(b.ADateTime as date) and cast(b.DDateTime as date)
OR cast(B.ADateTime as date) = cast(B.DDateTime as date))
GROUP BY Coalesce(A.Loc, B.LocID)
Order by Coalesce(A.Loc, B.LocID)

Related

Why is the SQL full outer join not presenting unmatched customers (avc_id)?

I appreciate your help in advance!
The right table avc_enr has 108K customers (b.avc_id) in it. In the 2nd table (alias a), we have about 97K customers (a.avc_id).
I tried to use right, left and full outer join but every time the count of customers shows 97K rather than 108K customers (under Total_users)... any idea why with full outer join the count function is not counting all customers even if no common match is found between two tables?
with avc_enr as
(
select
dt, avc_id, service_template_name
from
hive.thor_satellite.v_nms_inventory_nmsdb_avc_service
where
current_status = 'ACTIVE' and dt = 20220809
)
select
a.dt, a.metrics_date,
avg(a.vsat_fl_byte_count_kbps) as AUPU_Kbps,
count(b.avc_id) as Total_users
from
hive.thor_satellite.vda_satellite_nms_performance_smts_avc_pm_throughput a
full outer join
avc_enr b on a.avc_id = b.avc_id and a.dt = b.dt
where
a.dt = 20220809
group by
a.dt, a.metrics_date

LEFT JOIN by closer value condition

I have this query
SELECT
loc.proceso,
loc.codigo_municipio,
loc.codigo_concejo,
loc.concejo,
(CASE
WHEN loc.poblacion IS NOT NULL THEN loc.poblacion
ELSE pob.valor
END) AS poblacion
FROM develop.031401_elecciones_dimension_localizacion_electoral AS loc
LEFT JOIN develop.031401_elecciones_dimension_proceso_electoral AS proc
ON loc.proceso = proc.proceso
LEFT JOIN develop.020101_t05 AS pob
ON loc.codigo_municipio = CAST(pob.cmun AS INT) AND pob.year = proc.anno_eleccion
In the second LEFT JOIN, I would like to change the second condition, pob.year = proc.anno_eleccion, so that it does not only match the exact year. Instead, I would like to get the closest year stored in my pob table. For example, the first year stored in pob is 2003, so I want all the entries in loc whose year is earlier than 2003 to be matched with that row when performing the join. Conversely, the last year stored in pob is 2020, so I want the entries in loc whose year is 2021 or later to be matched with the 2020 row from my pob table. When the exact year exists in the pob table, it should be used for the join.
1. If you want the nearest year to NOW
I can't think of a direct join, but you can try this: use the ROW_NUMBER() function to rank the rows by year and join only the first one.
(WHERE rn = 1 keeps only the first-ranked row, which prevents duplicates.)
LEFT JOIN
(SELECT T.* FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY pob.cmun ORDER BY pob.year DESC) AS rn,
pob.*
FROM develop.020101_t05 AS pob) AS T
WHERE rn = 1) AS pob
-- no year condition here: the subquery already keeps only the latest year per cmun
ON loc.codigo_municipio = CAST(pob.cmun AS INT)
2. If you want the nearest year to your data
Even though it's not best practice, you can join your data using comparison operators in the join condition. Then take the difference between the two years, sort by it in ascending order, and pick the first result using the ROW_NUMBER() function. See the example:
SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY a.Id ORDER BY a.Year - b.Year) AS RowNumber,
a.Id,
a.Year AS AYear, -- distinct aliases: a derived table cannot expose two columns named Year
b.Year AS BYear,
a.Year - b.Year AS YearDiff
FROM a
LEFT JOIN b ON a.Id = b.Id AND a.Year >= b.Year) AS T
WHERE RowNumber = 1
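The example above only looks backwards (a.Year >= b.Year), so years before the first stored year get no match. One possible variation, sketched here under assumptions, ranks by the absolute year difference instead, so the closest year wins in either direction and an exact match (difference 0) always ranks first. The tables loc and pob, their columns, and the sample values are hypothetical, and the sketch runs on SQLite via Python's sqlite3 (window functions need SQLite 3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the loc and pob tables.
    CREATE TABLE loc (id INTEGER, municipio INTEGER, election_year INTEGER);
    CREATE TABLE pob (municipio INTEGER, year INTEGER, valor INTEGER);
    INSERT INTO loc VALUES (1, 1, 1999), (2, 1, 2010), (3, 1, 2012), (4, 1, 2023);
    INSERT INTO pob VALUES (1, 2003, 100), (1, 2010, 110), (1, 2020, 120);
""")

# Join on municipality only, then keep the pob row with the smallest
# absolute year gap per loc row. Ties are broken towards the earlier year.
nearest = conn.execute("""
    SELECT election_year, pob_year, valor
    FROM (
        SELECT l.election_year,
               p.year AS pob_year,
               p.valor,
               ROW_NUMBER() OVER (
                   PARTITION BY l.id
                   ORDER BY ABS(l.election_year - p.year), p.year
               ) AS rn
        FROM loc AS l
        LEFT JOIN pob AS p ON p.municipio = l.municipio
    ) AS t
    WHERE rn = 1
    ORDER BY election_year
""").fetchall()

print(nearest)
# [(1999, 2003, 100), (2010, 2010, 110), (2012, 2010, 110), (2023, 2020, 120)]
```

Joining on municipality alone pairs every loc row with every pob year for that municipality before ranking, so on large tables this may be costly.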

Return null values in GROUP BY query

This only returns results for entries that exist, which makes sense, but I'm trying to get it to display all of the B.[HostName] entries, even if there aren't entries for all of them. I'd like it to show 0 under the count for those. I've read about needing to use a LEFT JOIN on some of the tables and changing the COUNT(*) to use a field instead, but whenever I have tried that, it still results in the same data. Can anyone show me what would need to be changed for this to work the way I mentioned?
EDIT:
To clarify, since the initial answers didn't work:
The count would be coming from this table: [Database].[dbo].[WorkItemHistory] A -- we want all of the entries of A.[PlatformId]
A.[PlatformId] is a foreign key to C.[Id]
AND C.[EngineId] is a foreign key to B.[Id]
From there, we are getting B.[HostName]
So, all entries for B.[HostName] would be listed in the output and the count would come from the entries in the A.[WorkitemHistory] table.
We are using
SELECT DISTINCT B.[HostName], COUNT(*) AS Count
FROM [Database].[dbo].[WorkItemHistory] A, [Database].[dbo].[Engine] B, [Database].[dbo].[Platforms] C
WHERE A.[PlatformId] = C.[Id]
AND B.[Id] = C.[EngineId]
AND A.[Status] = '30'
AND A.[LastAttemptDateTime] >= CAST(GETDATE() AS Date)
GROUP BY B.[HostName]
ORDER BY COUNT(*) DESC
Replace your JOIN with a RIGHT JOIN to B, as below,
to also return the rows of B that have no matching rows in the other tables.
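SELECT B.[HostName], COUNT(A.[PlatformId]) AS Count -- count a column of A so unmatched hosts show 0
FROM [Database].[dbo].[WorkItemHistory] A
JOIN [Database].[dbo].[Platforms] C
On A.[PlatformId] = C.[Id]
-- filters on A belong in the join; in a WHERE clause they would drop the unmatched rows of B
AND A.[Status] = '30'
AND A.[LastAttemptDateTime] >= CAST(GETDATE() AS Date)
RIGHT JOIN [Database].[dbo].[Engine] B
On C.[EngineId] = B.[Id]
GROUP BY B.[HostName]
ORDER BY COUNT(A.[PlatformId]) DESC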
I think something like this will do it. You just need to left join the other tables to [Database].[dbo].[Engine], since you want all the hostnames.
SELECT B.[HostName],
COUNT(A.[PlatformId]) AS Count -- count work items, not platforms, so hosts without matching items show 0
FROM [Database].[dbo].[Engine] B
LEFT JOIN [Database].[dbo].[Platforms] C
ON B.[Id] = C.[EngineId]
LEFT JOIN [Database].[dbo].[WorkItemHistory] A
ON A.[PlatformId] = C.[Id]
AND A.[Status] = '30'
AND A.[LastAttemptDateTime] >= CAST(GETDATE() AS Date)
GROUP BY B.[HostName]
ORDER BY COUNT(A.[PlatformId]) DESC
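The difference between COUNT(*) and counting a column from the work-item table can be seen in a small sketch. It uses SQLite through Python's sqlite3 as a stand-in for SQL Server, and the tables engine, platforms, and work_item_history, with their rows, are hypothetical simplifications of the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the Engine/Platforms/WorkItemHistory tables.
    CREATE TABLE engine (id INTEGER, host_name TEXT);
    CREATE TABLE platforms (id INTEGER, engine_id INTEGER);
    CREATE TABLE work_item_history (platform_id INTEGER, status TEXT);
    INSERT INTO engine VALUES (1, 'host-a'), (2, 'host-b'), (3, 'host-c');
    INSERT INTO platforms VALUES (10, 1), (20, 2);
    INSERT INTO work_item_history VALUES (10, '30'), (10, '30'), (20, '10');
""")

JOINS = """
    FROM engine AS b
    LEFT JOIN platforms AS c ON c.engine_id = b.id
    LEFT JOIN work_item_history AS a
           ON a.platform_id = c.id AND a.status = '30'
    GROUP BY b.host_name
    ORDER BY b.host_name
"""

# COUNT(column) skips NULLs, so hosts with no matching work items get 0.
by_column = conn.execute(
    "SELECT b.host_name, COUNT(a.platform_id) AS cnt" + JOINS
).fetchall()

# COUNT(*) counts the NULL-padded row too, so those hosts show 1 instead of 0.
by_star = conn.execute(
    "SELECT b.host_name, COUNT(*) AS cnt" + JOINS
).fetchall()

print(by_column)  # [('host-a', 2), ('host-b', 0), ('host-c', 0)]
print(by_star)    # [('host-a', 2), ('host-b', 1), ('host-c', 1)]
```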

Why is my LEFT JOIN not pulling all the dates even though they exist in the other table? (SQL)

I have a table without dates and want to join it to a table that has dates. I am doing a left join on id and bn_number. An id can have more than one date, and I want the latest date from the other table. I am not sure how to get all the dates; if I could at least get them all, I could then choose the latest one.
select Reg_Property_id,a.Bnd_nbr,account_balance,abs(account_balance) as Bond_Balance,a.Bnd_regDate
into #Jan2014ValidFin
from #Jan2014Valid aa
left join Pr_analytics..bond a
on aa.Reg_Property_id=a.Prop_id
and aa.bnd_nbr=a.Bnd_nbr
where aa.reg_property_id is not null
Please assist.
Use the ROW_NUMBER() window function to get the most recent date:
SELECT c.*
FROM (
SELECT a.cols, b.cols, ROW_NUMBER() OVER (PARTITION BY a.col1, a.col2 ORDER BY b.theDate DESC) AS rn -- partition on a's keys so unmatched rows of a are kept
FROM a
LEFT OUTER JOIN b ON a.col1 = b.col1
AND a.col2 = b.col2
) c
WHERE c.rn = 1
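A runnable sketch of the same pattern follows, assuming SQLite via Python's sqlite3 (window functions need SQLite 3.25 or newer). The tables valid and bond and their rows are hypothetical stand-ins for #Jan2014Valid and Pr_analytics..bond. The partition is on the left table's keys, so a row with no bond at all still comes back, with a NULL date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the table without dates and the bond table.
    CREATE TABLE valid (prop_id INTEGER, bnd_nbr TEXT);
    CREATE TABLE bond (prop_id INTEGER, bnd_nbr TEXT, reg_date TEXT);
    INSERT INTO valid VALUES (1, 'B1'), (2, 'B2'), (3, 'B3');
    INSERT INTO bond VALUES
        (1, 'B1', '2013-01-05'),
        (1, 'B1', '2013-11-20'),
        (2, 'B2', '2012-06-01');
""")

# Number the joined rows per (prop_id, bnd_nbr), newest date first,
# then keep only row 1 of each group.
latest = conn.execute("""
    SELECT prop_id, bnd_nbr, reg_date
    FROM (
        SELECT v.prop_id, v.bnd_nbr, b.reg_date,
               ROW_NUMBER() OVER (
                   PARTITION BY v.prop_id, v.bnd_nbr
                   ORDER BY b.reg_date DESC
               ) AS rn
        FROM valid AS v
        LEFT JOIN bond AS b
               ON b.prop_id = v.prop_id AND b.bnd_nbr = v.bnd_nbr
    ) AS t
    WHERE rn = 1
    ORDER BY prop_id
""").fetchall()

print(latest)
# [(1, 'B1', '2013-11-20'), (2, 'B2', '2012-06-01'), (3, 'B3', None)]
```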
A simple group by should do the trick:
SELECT
Reg_Property_id -- What table is this from?
,a.Bnd_nbr
,account_balance -- What table is this from?
,abs(account_balance) as Bond_Balance -- What table is this from?
,max(a.Bnd_regDate) as Bnd_regDate
into #Jan2014ValidFin
from #Jan2014Valid aa
left join Pr_analytics..bond a
on aa.Reg_Property_id = a.Prop_id
and aa.bnd_nbr = a.Bnd_nbr
where aa.reg_property_id is not null
group by
Reg_Property_id
,a.Bnd_nbr
,account_balance
,abs(account_balance)
Note that if there are no dates (a.Bnd_regDate), you will get NULL
Note also that if any of the values marked "what table is this from" are found in #Jan2014Valid, you will need to either aggregate them (max, sum, etc.) or include them in the group by clause--I can't tell which, from the information provided.

SQL Count on multiple joins with dynamic WHERE

My issue is that I have a Select statement that has a where clause that is generated on the fly. It is joined across 5 tables.
I basically need a count of each DISTINCT instance of a USER ID in table 1 that falls within the scope of the WHERE. This has to be executable in one statement as well. So, essentially, I can't do a global GROUP BY because of the data I need returned from the other 4 tables.
If I could get a column holding that count, repeated on every row alongside the primary key column, that would be perfect. Right now this is the query I'm looking at:
SELECT *
FROM TBL1 1
INNER JOIN TBL2 2 On 2.FK = 1.FK
INNER JOIN TBL3 3 On 3.PK = 2.PK INNER JOIN TBL4 4 On 4.PK = 3.PK
LEFT OUTER JOIN TBL5 5 ON 4.PK = 5.PK
WHERE 1.Date_Time_In BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00'
ORDER BY
4.Column
, 3.Column
, 3.Column2
, 1.Date_Time_In DESC
So instead of selecting all columns, I will be filtering it down to about 5 or 6 but with that I need something like a Total column that is the Distinct count of TBL1's Primary Key that applies the WHERE clause that has a possibility of growing and shrinking in size.
I almost wish there was a way to apply the same WHERE clause to a subselect because I realize that would work but don't know of a way other than creating a variable and just placing it in both places which I can't do either.
If you are using SQL Server 2005 or higher, you could use an aggregate function with an OVER clause.
SELECT *
, COUNT(UserID) OVER(PARTITION BY UserID) AS 'Total'
FROM TBL1 1
INNER JOIN TBL2 2 On 2.FK = 1.FK
INNER JOIN TBL3 3 On 3.PK = 2.PK INNER JOIN TBL4 4 On 4.PK = 3.PK
LEFT OUTER JOIN TBL5 5 ON 4.PK = 5.PK
WHERE 1.Date_Time_In BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00'
ORDER BY
4.Column, 3.Column, 3.Column2, 1.Date_Time_In DESC
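To illustrate how an aggregate with an OVER clause repeats a per-user count on every row without a GROUP BY, here is a small sketch. It runs on SQLite via Python's sqlite3 rather than SQL Server (window functions need SQLite 3.25 or newer), and the table visits with its columns and rows is hypothetical. The window is evaluated after the WHERE clause, so the count only covers rows inside the date range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical single-table stand-in for the five-table join.
    CREATE TABLE visits (user_id TEXT, date_time_in TEXT, detail TEXT);
    INSERT INTO visits VALUES
        ('u1', '2010-11-16 09:00:00', 'a'),
        ('u1', '2010-11-20 09:00:00', 'b'),
        ('u2', '2010-11-18 09:00:00', 'c'),
        ('u1', '2010-12-05 09:00:00', 'd');  -- outside the date range
""")

# Every detail row is kept; total repeats the per-user row count.
rows = conn.execute("""
    SELECT user_id,
           detail,
           COUNT(*) OVER (PARTITION BY user_id) AS total
    FROM visits
    WHERE date_time_in BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00'
    ORDER BY user_id, detail
""").fetchall()

print(rows)  # [('u1', 'a', 2), ('u1', 'b', 2), ('u2', 'c', 1)]
```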
Something like adding:
inner join (select pk, count(distinct user_id) as Total from tbl1 WHERE Date_Time_In BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00' group by pk) as tbl1too on 1.PK = tbl1too.PK