SQL Server: retrieving multiple columns with rank 1

I will try to describe my issue as clearly as possible.
I have a dataset of 1,000 unique clients, say ##temp1.
I have another dataset, ##temp2, which holds the information related to the 1,000 clients from ##temp1 across the past 7 years. There are 6 specific columns in ##temp2 that I am interested in; let's call them columns A, B, C, D, E, F. For context, columns A, C, E hold a year of some form (2012, 2013, 2014, ...) and columns B, D, F hold ratings of some form (1, 2, 3, ... up to 5); all six are of data type float. Both the year and rating columns have NULL values, which I have converted to 0 for now.
My eventual goal is to create a report holding the information for the 1,000 clients in ##temp1, where each row has the following form:
ClientID | ClientName | ColA_Latest_Year1 | ColB_Corresponding_Rating_Year_1 | ColC_Latest_Year2 | ColD_Corresponding_Rating_Year_2 | ColE_Latest_Year3 | ColF_Corresponding_Rating_Year3.
ColA_Latest_Year1 should hold the latest year for that particular client from dataset ##temp2 and ColB_Corresponding_Rating_Year_1 should hold the rating from Column B corresponding to the year pulled in from Column A. Same goes for the other columns.
The approach I have taken so far was:
Create ##temp1 as needed.
Create ##temp2 as needed.
LEFT JOIN ##temp1 to ##temp2 on client IDs to retrieve the year and rating information for all the clients in ##temp1, and put all that information in ##temp3. There will be multiple rows for every client in ##temp3 because the data covers multiple years.
Rank the year columns (A, C, E) partitioned by client_id and put the result in ##temp4.
What I have now is something like this,
Rnk_A | Rnk_C | Rnk_F | ColA | ColB | ColC | ColD | ColE | ColF | Client_id | Client_name
2 | 1 | 1 | 0 | 0 | 0 | 0 | 2014 | 1 | 111 | 'ABC'
1 | 2 | 1 | 2012 | 1 | 0 | 0 | 0 | 0 | 111 | 'ABC'
My goal is
Rnk_A | Rnk_C | Rnk_F | ColA | ColB | ColC | ColD | ColE | ColF | Client_id | Client_name
1 | 1 | 1 | 2012| 1 | 0 | 0 | 2014| 1 | 111 | 'ABC'
Any help is appreciated.

This answer assumes you don't have any duplicates per client in columns A, C, E. If you do have duplicates you'd need to find a way to differentiate them and make the necessary changes.
The hurdle that you've failed to overcome in your attempt (as described) is that you're trying to join from temp1 to temp2 only once for lookup information that could come from 3 distinct rows of temp2. This cannot work as you hope. You must perform separate joins for each pair [A,B] [C,D] and [E,F]. The following demonstrates a solution using CTEs to derive the lookup data for each pair.
/********* Prepare sample tables and data ***********/
declare #t1 table (
ClientId int,
ClientName varchar(50)
)
declare #t2 table (
ClientId int,
ColA datetime,
ColB float,
ColC datetime,
ColD float,
ColE datetime,
ColF float
)
insert into #t1
select 1, 'Client 1' union all
select 2, 'Client 2' union all
select 3, 'Client 3' union all
select 4, 'Client 4'
insert into #t2
select 1, '20001011', 1, '20010101', 7, '20130101', 14 union all
select 1, '20040101', 4, '20170101', 1, '20120101', 1 union all
select 1, '20051231', 0, '20020101', 15, '20110101', 1 union all
select 2, '20060101', 2, NULL, 15, '20110101', NULL union all
select 2, '20030101', 3, NULL, NULL, '20100101', 17 union all
select 3, NULL, NULL, '20170101', 42, NULL, NULL
--select * from #t1
--select * from #t2
/********* Solution ***********/
;with MaxA as (
select ROW_NUMBER() OVER (PARTITION BY t2.ClientId ORDER BY t2.ColA DESC) rn,
t2.ClientId, t2.ColA, t2.ColB
from #t2 t2
--where t2.ColA is not null and t2.ColB is not null
), MaxC as (
select ROW_NUMBER() OVER (PARTITION BY t2.ClientId ORDER BY t2.ColC DESC) rn,
t2.ClientId, t2.ColC, t2.ColD
from #t2 t2
--where t2.ColC is not null and t2.ColD is not null
), MaxE as (
select ROW_NUMBER() OVER (PARTITION BY t2.ClientId ORDER BY t2.ColE DESC) rn,
t2.ClientId, t2.ColE, t2.ColF
from #t2 t2
--where t2.ColE is not null and t2.ColF is not null
)
select t1.ClientId, t1.ClientName, a.ColA, a.ColB, c.ColC, c.ColD, e.ColE, e.ColF
from #t1 t1
left join MaxA a on
a.ClientId = t1.ClientId
and a.rn = 1
left join MaxC c on
c.ClientId = t1.ClientId
and c.rn = 1
left join MaxE e on
e.ClientId = t1.ClientId
and e.rn = 1
If you run this you may notice some peculiar results for Client 2 in columns C and F. This is because (as per your question) there may be some NULL values. ColC date is "unknown" and ColF rating is "unknown".
My solution preserves NULL values instead of converting them to zeroes. This allows you to handle them explicitly if you so choose. I commented out lines in the above query that could be used to ignore NULL dates and ratings if necessary.

Related

Daily record count based on status allocation

I have a table named Books and a table named Transfer with the following structure:
CREATE TABLE Books
(
BookID int,
Title varchar(150),
PurchaseDate date,
Bookstore varchar(150),
City varchar(150)
);
INSERT INTO Books VALUES (1, 'Cujo', '2022-02-01', 'CentralPark1', 'New York');
INSERT INTO Books VALUES (2, 'The Hotel New Hampshire', '2022-01-08', 'TheStrip1', 'Las Vegas');
INSERT INTO Books VALUES (3, 'Gorky Park', '2022-05-19', 'CentralPark2', 'New York');
CREATE TABLE Transfer
(
BookID int,
BookStatus varchar(50),
TransferDate date
);
INSERT INTO Transfer VALUES (1, 'Rented', '2022-11-01');
INSERT INTO Transfer VALUES (1, 'Returned', '2022-11-05');
INSERT INTO Transfer VALUES (1, 'Rented', '2022-11-06');
INSERT INTO Transfer VALUES (1, 'Returned', '2022-11-09');
INSERT INTO Transfer VALUES (2, 'Rented', '2022-11-03');
INSERT INTO Transfer VALUES (2, 'Returned', '2022-11-09');
INSERT INTO Transfer VALUES (2, 'Rented', '2022-11-15');
INSERT INTO Transfer VALUES (2, 'Returned', '2022-11-23');
INSERT INTO Transfer VALUES (3, 'Rented', '2022-11-14');
INSERT INTO Transfer VALUES (3, 'Returned', '2022-11-21');
INSERT INTO Transfer VALUES (3, 'Rented', '2022-11-25');
INSERT INTO Transfer VALUES (3, 'Returned', '2022-11-29');
See fiddle.
I want to do a query for a date interval (in this case 01.11 - 09.11) that returns the book count for each day based on BookStatus from Transfer, like so:
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Status | 01.11 | 02.11 | 03.11 | 04.11 | 05.11 | 06.11 | 07.11 | 08.11 | 09.11 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Rented | 2 | 1 | 2 | 2 | 0 | 2 | 3 | 3 | 1 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Returned | 1 | 2 | 1 | 1 | 3 | 1 | 0 | 0 | 2 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
A book remains rented as long as it was not returned, and is counted as 'Returned' every day until it is rented out again.
This is what the query result would look like for one book (BookID 1):
I see two possible solutions.
Dynamic solution
Use a (recursive) common table expression to generate a list of all the dates that fall within the requested range.
Use two cross apply statements that each perform a count() aggregation to count the number of book transfers.
-- generate date range
with Dates as
(
select convert(date, '2022-11-01') as TransferDate
union all
select dateadd(day, 1, d.TransferDate)
from Dates d
where d.TransferDate < '2022-11-10'
)
select d.TransferDate,
c1.CountRented,
c2.CountReturned
from Dates d
-- count all rented books up till today, that have not been returned before today
cross apply ( select count(1) as CountRented
from Transfer t1
where t1.BookStatus = 'Rented'
and t1.TransferDate <= d.TransferDate
and not exists ( select 'x'
from Transfer t2
where t2.BookId = t1.BookId
and t2.BookStatus = 'Returned'
and t2.TransferDate > t1.TransferDate
and t2.TransferDate <= d.TransferDate ) ) c1
-- count all returned books for today
cross apply ( select count(1) as CountReturned
from Transfer t1
where t1.BookStatus = 'Returned'
and t1.TransferDate = d.TransferDate ) c2;
Result:
TransferDate CountRented CountReturned
------------ ----------- -------------
2022-11-01 1 0
2022-11-02 1 0
2022-11-03 2 0
2022-11-04 2 0
2022-11-05 1 1
2022-11-06 2 0
2022-11-07 2 0
2022-11-08 2 0
2022-11-09 0 2
2022-11-10 0 0
This result is not the pivoted outcome described in the question. However, pivoting this dynamic solution requires dynamic sql, which is not trivial!
Static solution
This will deliver the exact outcome as described in the question (including the date formatting), but requires the date range to be fully typed out once.
The essential building blocks are similar to the dynamic solution above:
A recursive common table expression to generate a date range.
Two cross apply statements to perform the counting calculations as before.
There is also:
An extra cross join to duplicate the date range for each BookStatus (to avoid NULL values in the result).
Some replace(), str() and datepart() functions to format the dates.
A case expression to merge the two counts to a single column.
The solution is probably not the most performant, but it does deliver the requested result. If you want to validate for BookID=1 then just uncomment the extra WHERE filter clauses.
with Dates as
(
select convert(date, '2022-11-01') as TransferDate
union all
select dateadd(day, 1, d.TransferDate)
from Dates d
where d.TransferDate < '2022-11-10'
),
PivotInput as
(
select replace(str(datepart(day, d.TransferDate), 2), space(1), '0') + '.' + replace(str(datepart(month, d.TransferDate), 2), space(1), '0') as TransferDate,
s.BookStatus as [Status],
case when s.BookStatus = 'Rented' then sc1.CountRented else sc2.CountReturned end as BookStatusCount
from Dates d
cross join (values('Rented'), ('Returned')) s(BookStatus)
cross apply ( select count(1) as CountRented
from Transfer t1
where t1.BookStatus = s.BookStatus
and t1.TransferDate <= d.TransferDate
--and t1.BookID = 1
and not exists ( select 'x'
from Transfer t2
where t2.BookId = t1.BookId
and t2.BookStatus = 'Returned'
and t2.TransferDate > t1.TransferDate
and t2.TransferDate <= d.TransferDate ) ) sc1
cross apply ( select count(1) as CountReturned
from Transfer t3
where t3.TransferDate = d.TransferDate
--and t3.BookID = 1
and t3.BookStatus = 'Returned' ) sc2
)
select piv.*
from PivotInput pivi
pivot (sum(pivi.BookStatusCount) for pivi.TransferDate in (
[01.11],
[02.11],
[03.11],
[04.11],
[05.11],
[06.11],
[07.11],
[08.11],
[09.11],
[10.11])) piv;
Result:
Status 01.11 02.11 03.11 04.11 05.11 06.11 07.11 08.11 09.11 10.11
Rented 1 1 2 2 1 2 2 2 0 0
Returned 0 0 0 0 1 0 0 0 2 0
Fiddle to see things in action.

How to find duplicate sets of values in column SQL

I have a database table in SQL Server like this:
+----+--------+
| ID | Number |
+----+--------+
| 1 | 4 |
| 2 | 2 |
| 3 | 6 |
| 4 | 5 |
| 5 | 3 |
| 6 | 2 |
| 7 | 6 |
| 8 | 4 |
| 9 | 5 |
| 10 | 1 |
| 11 | 6 |
| 12 | 4 |
| 13 | 2 |
| 14 | 6 |
+----+--------+
I want to get all values of the rows in column Number that match the last row, or the last 2 rows, or the last 3 rows, and so on; and having found those matches, go on to get the value that appears next after each match and count the number of its appearances.
Result output like this:
If matching the last row:
We see that the number appearing after a 6 in column Number is either 4 or 5.
The pair 6,4 appears 2 times in column Number and the pair 6,5 appears 1 time.
+---------------------+-------------------------+--------------+
| "Condition to find" | "Next Number in column" | Times appear |
+---------------------+-------------------------+--------------+
| 6 | 5 | 1 |
| 6 | 4 | 2 |
+---------------------+-------------------------+--------------+
If matching the last two rows:
+---------------------+-------------------------+--------------+
| "Condition to find" | "Next Number in column" | Times appear |
+---------------------+-------------------------+--------------+
| 2,6 | 5 | 1 |
| 2,6 | 4 | 1 |
+---------------------+-------------------------+--------------+
If matching the last 3 rows:
+---------------------+-------------------------+--------------+
| "Condition to find" | "Next Number in column" | Times appear |
+---------------------+-------------------------+--------------+
| 4,2,6 | 5 | 1 |
+---------------------+-------------------------+--------------+
And for the last 4, 5, 6, ... rows, continue until Times appear returns 0:
+---------------------+-------------------------+--------------+
| "Condition to find" | "Next Number in column" | Times appear |
+---------------------+-------------------------+--------------+
| 6,4,2,6 | | |
+---------------------+-------------------------+--------------+
Any idea how to get this? Thanks so much!
Here's an answer which uses the LEAD function, which (once ordered) takes a value from a certain number of rows ahead.
It converts your table from one number per row to one that also includes the next 3 numbers on each row.
Then you can join on those columns to get the counts etc.
CREATE TABLE #Src (Id int PRIMARY KEY, Num int)
INSERT INTO #Src (Id, Num) VALUES
( 1, 4),
( 2, 2),
( 3, 6),
( 4, 5),
( 5, 3),
( 6, 2),
( 7, 6),
( 8, 4),
( 9, 5),
(10, 1),
(11, 6),
(12, 4),
(13, 2),
(14, 6)
CREATE TABLE #SrcWithNext (Id int PRIMARY KEY, Num int, Next1 int, Next2 int, Next3 int)
-- First step - use LEAD to get the next 1, 2, 3 values
INSERT INTO #SrcWithNext (Id, Num, Next1, Next2, Next3)
SELECT ID, Num,
LEAD(Num, 1, NULL) OVER (ORDER BY Id) AS Next1,
LEAD(Num, 2, NULL) OVER (ORDER BY Id) AS Next2,
LEAD(Num, 3, NULL) OVER (ORDER BY Id) AS Next3
FROM #Src
SELECT * FROM #SrcWithNext
/* Find number with each combination */
-- 2 chars
SELECT A.Num, A.Next1, COUNT(*) AS Num_Instances
FROM (SELECT DISTINCT Num, Next1 FROM #SrcWithNext) AS A
INNER JOIN #SrcWithNext AS B ON A.Num = B.Num AND A.Next1 = B.Next1
WHERE A.Num <= B.Num
GROUP BY A.Num, A.Next1
ORDER BY A.Num, A.Next1
-- 3 chars
SELECT A.Num, A.Next1, A.Next2, COUNT(*) AS Num_Instances
FROM (SELECT DISTINCT Num, Next1, Next2 FROM #SrcWithNext) AS A
INNER JOIN #SrcWithNext AS B
ON A.Num = B.Num
AND A.Next1 = B.Next1
AND A.Next2 = B.Next2
WHERE A.Num <= B.Num
GROUP BY A.Num, A.Next1, A.Next2
ORDER BY A.Num, A.Next1, A.Next2
-- 4 chars
SELECT A.Num, A.Next1, A.Next2, A.Next3, COUNT(*) AS Num_Instances
FROM (SELECT DISTINCT Num, Next1, Next2, Next3 FROM #SrcWithNext) AS A
INNER JOIN #SrcWithNext AS B
ON A.Num = B.Num
AND A.Next1 = B.Next1
AND A.Next2 = B.Next2
AND A.Next3 = B.Next3
WHERE A.Num <= B.Num
GROUP BY A.Num, A.Next1, A.Next2, A.Next3
ORDER BY A.Num, A.Next1, A.Next2, A.Next3
Here's a db<>fiddle to check.
Notes
The A.Num <= B.Num means it finds all matches to itself, and then only counts others once
This answer finds all combinations. To filter, it currently would need to filter as separate columns e.g., instead of 2,6, you'd filter on Num = 2 AND Next1 = 6. Feel free to then do various text/string concatenation functions to create references for your preferred search/filter approach.
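For instance, to look up the pair 2,6 from the question directly, a filter over the derived columns could look like this (a sketch, reusing the #SrcWithNext table built above; the TimesAppear alias is illustrative):

```sql
-- Count how often the pair (2, 6) occurs and which number follows it.
-- Assumes #SrcWithNext has been populated as in the answer above.
SELECT s.Num, s.Next1, s.Next2 AS NextNumber, COUNT(*) AS TimesAppear
FROM #SrcWithNext s
WHERE s.Num = 2
  AND s.Next1 = 6
  AND s.Next2 IS NOT NULL  -- ignore trailing rows that have no successor
GROUP BY s.Num, s.Next1, s.Next2
ORDER BY TimesAppear DESC
```

With the sample data this returns one row for (2,6) followed by 5 and one row for (2,6) followed by 4, matching the expected output in the question.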
Hmmm . . . I am thinking that you want to create the "pattern to find" as a string. Unfortunately, string_agg() is not a windowing function, but you can use apply:
select t.*, p.*
from t cross apply
(select string_agg(number, ',') within group (order by id) as pattern
from (select top (3) t2.*
from t t2
where t2.id <= t.id
order by t2.id desc
) t2
) p;
You would change the "3" to whatever number of rows that you want.
Then you can use this to identify the rows where the patterns are matched and aggregate:
with tp as (
select t.*, p.*
from t cross apply
(select string_agg(number, ',') within group (order by id) as pattern
from (select top (3) t2.*
from t t2
where t2.id <= t.id
order by t2.id desc
) t2
) p
)
select pattern_to_find, next_number, count(*)
from (select tp.*,
first_value(pattern) over (order by id desc) as pattern_to_find,
lead(number) over (order by id) as next_number
from tp
) tp
where pattern = pattern_to_find
group by pattern_to_find, next_number;
Here is a db<>fiddle.
If you are using an older version of SQL Server -- one that doesn't support string_agg() -- you can calculate the pattern using lag():
with tp as (
select t.*,
concat(lag(number, 2) over (order by id), ',',
lag(number, 1) over (order by id), ',',
number
) as pattern
from t
)
Actually, if you have a large amount of data, it would be interesting to know which is faster -- the apply version or the lag() version. I suspect that lag() might be faster.
EDIT:
In unsupported versions of SQL Server, you can get the pattern using:
select t.*, p.*
from t cross apply
(select (select cast(number as varchar(255)) + ','
from (select top (3) t2.*
from t t2
where t2.id <= t.id
order by t2.id desc
) t2
order by t2.id desc
for xml path ('')
) as pattern
) p
You can use similar logic for lead().
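A sketch of what that lead() analogue might look like: order ascending and take ids after the current row instead of before it (the next_numbers alias is illustrative):

```sql
-- Look ahead instead of behind: collect the numbers from the rows
-- *after* the current id with FOR XML PATH.
select t.*, p.*
from t cross apply
     (select (select cast(number as varchar(255)) + ','
              from (select top (3) t2.*
                    from t t2
                    where t2.id > t.id      -- rows after the current one
                    order by t2.id asc
                   ) t2
              order by t2.id asc
              for xml path ('')
             ) as next_numbers
     ) p;
```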
I tried to solve this problem by converting the Number column to a string.
Here is my code, using a function whose input is the number of last rows to match:
(Note that the main table is named "test".)
create function duplicate(@nlast int)
returns @temp table (RowNumbers varchar(20), Number varchar(1))
as
begin
declare @num varchar(20)
set @num=''
declare @count int=1
while @count <= (select count(id) from test)
begin
set @num = @num + cast((select Number from test where @count=ID) as varchar(20))
set @count=@count+1
end
declare @lastnum varchar(20)
set @lastnum= (select RIGHT(@num,@nlast))
declare @count2 int=1
while @count2 <= len(@num)-@nlast
begin
if (SUBSTRING(@num,@count2,@nlast) = @lastnum)
begin
insert into @temp
select @lastnum ,SUBSTRING(@num,@count2+@nlast,1)
end
set @count2=@count2+1
end
return
end
go
select RowNumbers AS "Condition to find", Number AS "Next Number in column" , COUNT(Number) AS "Times appear" from dbo.duplicate(2)
group by Number, RowNumbers

Displaying whole table after stripping characters in SQL Server

This question has 2 parts.
Part 1
I have a table "Groups":
group_ID person
-----------------------
1 Person 10
2 Person 11
3 Jack
4 Person 12
Note that not all data in the "person" column have the same format.
In SQL Server, I have used the following query to strip the "Person " characters out of the person column:
SELECT
REPLACE([person],'Person ','')
AS [person]
FROM Groups
I did not use UPDATE in the query above as I do not want to alter the data in the table.
The query returned this result:
person
------
10
11
12
However, I would like this result instead:
group_ID person
-------------------
1 10
2 11
3 Jack
4 12
What should be my query to achieve this result?
Part 2
I have another table "Details":
detail_ID group1 group2
-------------------------------
100 1 2
101 3 4
From the intended result in Part 1, where the numbers in the "person" column correspond to those in "group1" and "group2" of table "Details", how do I selectively convert the numbers in "person" to integers and join them with "Details"?
Note that all data under "person" in Part 1 are strings (nvarchar(100)).
Here is the intended query output:
detail_ID group1 group2
-------------------------------
100 10 11
101 Jack 12
Note that I do not wish to permanently alter anything in both tables and the intended output above is just a result of a SELECT query.
I don't think the first part will be a problem here. Your query is working fine and gives your expected result.
Schema:
CREATE TABLE #Groups (group_ID INT, person VARCHAR(50));
INSERT INTO #Groups
SELECT 1,'Person 10'
UNION ALL
SELECT 2,'Person 11'
UNION ALL
SELECT 3,'Jack'
UNION ALL
SELECT 4,'Person 12';
CREATE TABLE #Details(detail_ID INT,group1 INT, group2 INT);
INSERT INTO #Details
SELECT 100, 1, 2
UNION ALL
SELECT 101, 3, 4 ;
Part 1:
For me, your query gives exactly what you are expecting:
SELECT group_ID,REPLACE([person],'Person ','') AS person
FROM #Groups
+----------+--------+
| group_ID | person |
+----------+--------+
| 1 | 10 |
| 2 | 11 |
| 3 | Jack |
| 4 | 12 |
+----------+--------+
Part 2:
;WITH CTE AS(
SELECT group_ID
,REPLACE([person],'Person ','') AS person
FROM #Groups
)
SELECT D.detail_ID, G1.person, G2.person
FROM #Details D
INNER JOIN CTE G1 ON D.group1 = G1.group_ID
INNER JOIN CTE G2 ON D.group2 = G2.group_ID
Result:
+-----------+--------+--------+
| detail_ID | person | person |
+-----------+--------+--------+
| 100       | 10     | 11     |
| 101       | Jack   | 12     |
+-----------+--------+--------+
Try the following query; it should give you the desired output.
;WITH MT AS
(
SELECT
group_ID, REPLACE([person],'Person ','') AS person
FROM Groups
)
SELECT D.detail_ID, MT1.person AS group1, MT2.person AS group2
FROM
Details D
INNER JOIN MT MT1 ON MT1.group_ID = D.group1
INNER JOIN MT MT2 ON MT2.group_ID = D.group2
The first query works
declare #T table (id int primary key, name varchar(10));
insert into #T values
(1, 'Person 10')
, (2, 'Person 11')
, (3, 'Jack')
, (4, 'Person 12');
declare #G table (id int primary key, grp1 int, grp2 int);
insert into #G values
(100, 1, 2)
, (101, 3, 4);
with cte as
( select t.id, t.name, ltrim(rtrim(replace(t.name, 'person', ''))) as sp
from #T t
)
-- select * from cte order by cte.id;
select g.id, c1.sp as grp1, c2.sp as grp2
from #G g
join cte c1
on c1.id = g.grp1
join cte c2
on c2.id = g.grp2
order
by g.id;
id grp1 grp2
----------- ----------- -----------
100 10 11
101 Jack 12

SQL: Query to check if a column meets certain criteria, performing one action if it does and another if it doesn't

I have found it quite hard to word what I want to do in the title so I will try my best to explain now!
I have two tables which I am using:
Master_Tab and Parts_Tab
Parts_Tab has the following information:
Order_Number | Completed | Part_Number
      1      |     Y     |     64
      2      |     N     |     32
      3      |     Y     |     42
      1      |     N     |     32
      1      |     N     |      5
Master_Tab has the following information:
Order_Number
1
2
3
4
5
I want to generate a query which will return ALL of the Order_Numbers listed in the Master_Tab on the following conditions...
For each Order_Number I want to check the Parts_Tab table to see if there are any parts which aren't complete (Completed = 'N'). For each Order_Number I then want to count the number of uncompleted parts an order has against it. If an Order_Number does not have uncompleted parts or it is not in the Parts_Table then I want the count value to be 0.
So the table that would be generated would look like this:
Order_Number | Count_of_Non_Complete_Parts|
1 | 2 |
2 | 1 |
3 | 0 |
4 | 0 |
5 | 0 |
I was hoping that using a different kind of join on the tables would do this but I am clearly missing the trick!
Any help is much appreciated!
Thanks.
I have used COALESCE to convert NULL to zero where necessary. Depending on your database platform, you may need to use another method, e.g. ISNULL or CASE.
select mt.Order_Number,
coalesce(ptc.Count, 0) as Count_of_Non_Complete_Parts
from Master_Tab mt
left outer join (
select Order_Number, count(*) as Count
from Parts_Tab
where Completed = 'N'
group by Order_Number
) ptc on mt.Order_Number = ptc.Order_Number
order by mt.Order_Number
You are looking for a LEFT JOIN.
SELECT mt.order_number, count(part_number) AS count_noncomplete_parts
FROM master_tab mt LEFT JOIN parts_tab pt
ON mt.order_number=pt.order_number AND pt.completed='N'
GROUP BY mt.order_number;
It is also possible to put pt.completed='N' into a WHERE clause, but you have to be careful of NULLs. Instead of the AND you can have
WHERE pt.completed='N' OR pt.completed IS NULL
Note, however, that with this WHERE clause an order whose parts are all completed drops out of the result entirely, so keeping the condition in the ON clause is safer.
SELECT mt.Order_Number, SUM(tbl.Incomplete) AS Count_of_Non_Complete_Parts
FROM Master_Tab mt
LEFT JOIN (
SELECT Order_Number, CASE WHEN Completed = 'N' THEN 1 ELSE 0 END Incomplete
FROM Parts_Tab
) tbl on mt.Order_Number = tbl.Order_Number
GROUP BY mt.Order_Number
Add a WHERE clause to the outer query if you need to filter for specific order numbers.
I think it's easiest to get a subquery in there. I think this should be self-explanatory; if not, feel free to ask any questions.
CREATE TABLE #Parts
(
Order_Number int,
Completed char(1),
Part_Number int
)
CREATE TABLE #Master
(
Order_Number int
)
INSERT INTO #Parts
SELECT 1, 'Y', 64 UNION ALL
SELECT 2, 'N', 32 UNION ALL
SELECT 3, 'Y', 42 UNION ALL
SELECT 1, 'N', 32 UNION ALL
SELECT 1, 'N', 5
INSERT INTO #Master
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6
SELECT M.Order_Number, ISNULL(Totals.NonCompletedCount, 0) FROM #Master M
LEFT JOIN (SELECT P.Order_Number, COUNT(*) AS NonCompletedCount FROM #Parts P
WHERE P.Completed = 'N'
GROUP BY P.Order_Number) Totals ON Totals.Order_Number = M.Order_Number

Query for missing elements

I have a table with the following structure:
timestamp | name | value
0 | john | 5
1 | NULL | 3
8 | NULL | 12
12 | john | 3
33 | NULL | 4
54 | pete | 1
180 | NULL | 4
400 | john | 3
401 | NULL | 4
592 | anna | 2
Now what I am looking for is a query that will give me the sum of the values for each name, treating the NULLs in between (ordered by the timestamp) as the first non-NULL name further down the list, as if the table were as follows:
timestamp | name | value
0 | john | 5
1 | john | 3
8 | john | 12
12 | john | 3
33 | pete | 4
54 | pete | 1
180 | john | 4
400 | john | 3
401 | anna | 4
592 | anna | 2
and I would query SUM(value), name from this table group by name. I have thought and tried, but I can't come up with a proper solution. I have looked at recursive common table expressions, and think the answer may lie in there, but I haven't been able to properly understand those.
These tables are just examples, and I don't know the timestamp values in advance.
Could someone give me a hand? Help would be very much appreciated.
With Inputs As
(
Select 0 As [timestamp], 'john' As Name, 5 As value
Union All Select 1, NULL, 3
Union All Select 8, NULL, 12
Union All Select 12, 'john', 3
Union All Select 33, NULL, 4
Union All Select 54, 'pete', 1
Union All Select 180, NULL, 4
Union All Select 400, 'john', 3
Union All Select 401, NULL, 4
Union All Select 592, 'anna', 2
)
, NamedInputs As
(
Select I.timestamp
, Coalesce (I.Name
, (
Select I3.Name
From Inputs As I3
Where I3.timestamp = (
Select Max(I2.timestamp)
From Inputs As I2
Where I2.timestamp < I.timestamp
And I2.Name Is not Null
)
)) As name
, I.value
From Inputs As I
)
Select NI.name, Sum(NI.Value) As Total
From NamedInputs As NI
Group By NI.name
Btw, orders of magnitude faster than any query would be to first correct the data: i.e., update the name column to have the proper value, make it non-nullable, and then run a simple GROUP BY to get your totals.
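That one-off cleanup could be sketched as follows, assuming a real table named Inputs with the same columns as the sample data (the name is an assumption; substitute your own):

```sql
-- One-off data fix: backfill each NULL name from the nearest earlier
-- non-NULL name, after which a plain GROUP BY suffices.
UPDATE I
SET Name = (SELECT TOP (1) I2.Name
            FROM Inputs AS I2
            WHERE I2.timestamp < I.timestamp
              AND I2.Name IS NOT NULL
            ORDER BY I2.timestamp DESC)
FROM Inputs AS I
WHERE I.Name IS NULL;

-- Totals are then a simple aggregation.
SELECT Name, SUM(value) AS Total
FROM Inputs
GROUP BY Name;
```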
Additional Solution
Select Coalesce(I.Name, I2.Name), Sum(I.value) As Total
From Inputs As I
Left Join (
Select I1.timestamp, MAX(I2.Timestamp) As LastNameTimestamp
From Inputs As I1
Left Join Inputs As I2
On I2.timestamp < I1.timestamp
And I2.Name Is Not Null
Group By I1.timestamp
) As Z
On Z.timestamp = I.timestamp
Left Join Inputs As I2
On I2.timestamp = Z.LastNameTimestamp
Group By Coalesce(I.Name, I2.Name)
You don't need a CTE, just a simple subquery.
select t.timestamp, ISNULL(t.name, (
select top(1) i.name
from inputs i
where i.timestamp < t.timestamp
and i.name is not null
order by i.timestamp desc
)), t.value
from inputs t
And summing from here
select name, SUM(value) as totalValue
from
(
select t.timestamp, ISNULL(t.name, (
select top(1) i.name
from inputs i
where i.timestamp < t.timestamp
and i.name is not null
order by i.timestamp desc
)) as name, t.value
from inputs t
) N
group by name
I hope I'm not going to be embarrassed by offering you this little recursive CTE query of mine as a solution to your problem.
;WITH
numbered_table AS (
SELECT
timestamp, name, value,
rownum = ROW_NUMBER() OVER (ORDER BY timestamp)
FROM your_table
),
filled_table AS (
SELECT
timestamp,
name,
value
FROM numbered_table
WHERE rownum = 1
UNION ALL
SELECT
nt.timestamp,
name = ISNULL(nt.name, ft.name),
nt.value
FROM numbered_table nt
INNER JOIN filled_table ft ON nt.rownum = ft.rownum + 1
)
SELECT *
FROM filled_table
/* or go ahead aggregating instead */
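If you go ahead aggregating instead, a sketch of the final SELECT (keeping the two CTEs above unchanged) could be:

```sql
-- Aggregate the forward-filled rows instead of returning them raw.
SELECT name, SUM(value) AS total
FROM filled_table
GROUP BY name
```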