I have a query that I cannot get to work right. I have 3 tables: Person, PersonProgram and Category.
Person: ID, ....
PersonProgram: ID, PersonID, Category, Code ...
Category: ID, ...
The Person table has 1 record for each person, and PersonProgram has multiple programs per person. There are 4 categories, and I need to pull a single row per person containing one specific program for each category.
Person Table:
1
2
3
PersonProgram Table
1, 1, 1, 1
2, 1, 2, 1
3, 1, 1, 3
4, 2, 1, 1
5, 2, 3, 3
What the desired outcome should be:
PersonID, ProgramIDforCat1, ProgramIDforCat2, ProgramIDforCat3, ProgramIDforCat4
1, 1, 2, NULL, NULL
2, 1, NULL, 3, NULL
The problem is that there are multiple Program records for each person and category, with a code of 1, 2 or 3. I need to give priority to Code 1, then Code 3, and ignore the rest, while still pulling only 1 record, or NULL if none exists.
I am losing it trying to get this to work.
FYI, it has to be in a view.
Thanks for any help.
WITH Person AS
(
SELECT 1 AS ID UNION ALL
SELECT 2 AS ID UNION ALL
SELECT 3 AS ID
),
PersonProgram AS
(
SELECT 1 AS ID, 1 AS PersonID, 1 AS Category, 1 AS Code UNION ALL
SELECT 2, 1, 2, 1 UNION ALL
SELECT 3, 1, 1, 3 UNION ALL
SELECT 4, 2, 1, 1 UNION ALL
SELECT 5, 2, 3, 3
),
pp2 AS
(
SELECT *
,ROW_NUMBER() OVER
(PARTITION BY PersonID, Category
ORDER BY CASE WHEN Code = 1 THEN 0 ELSE 1 END,
CASE WHEN Code = 3 THEN 0 ELSE 1 END) AS RN
FROM PersonProgram
)
select PersonID ,
max(case when Category =1 then pp2.ID end) ProgramIDforCat1,
max(case when Category =2 then pp2.ID end) ProgramIDforCat2,
max(case when Category =3 then pp2.ID end) ProgramIDforCat3,
max(case when Category =4 then pp2.ID end) ProgramIDforCat4
from Person p join pp2
on pp2.PersonID = p.ID
WHERE RN=1
group by PersonID
Returns
PersonID ProgramIDforCat1 ProgramIDforCat2 ProgramIDforCat3 ProgramIDforCat4
----------- ---------------- ---------------- ---------------- ----------------
1 1 2 NULL NULL
2 4 NULL 5 NULL
This differs from your expected results, although I can make it match by using pp2.Category rather than pp2.ID. Can you clarify?
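For anyone who wants to try the ranking idea without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module (SQLite 3.25+ is assumed for window functions). It is not the answer's exact query: the `WHERE Code IN (1, 3)` filter is my reading of "ignore the rest", and the row with ID 6 is a hypothetical addition to show that filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PersonProgram (ID INTEGER, PersonID INTEGER, Category INTEGER, Code INTEGER);
INSERT INTO PersonProgram VALUES
  (1, 1, 1, 1), (2, 1, 2, 1), (3, 1, 1, 3), (4, 2, 1, 1), (5, 2, 3, 3),
  (6, 3, 1, 2);  -- hypothetical row: Code 2 is neither 1 nor 3, so it is filtered out
""")

QUERY = """
WITH ranked AS (
    SELECT ID, PersonID, Category,
           ROW_NUMBER() OVER (
               PARTITION BY PersonID, Category
               -- Code 1 sorts first, Code 3 second; ID breaks ties deterministically
               ORDER BY CASE Code WHEN 1 THEN 0 ELSE 1 END, ID
           ) AS RN
    FROM PersonProgram
    WHERE Code IN (1, 3)
)
SELECT PersonID,
       MAX(CASE WHEN Category = 1 THEN ID END) AS ProgramIDforCat1,
       MAX(CASE WHEN Category = 2 THEN ID END) AS ProgramIDforCat2,
       MAX(CASE WHEN Category = 3 THEN ID END) AS ProgramIDforCat3,
       MAX(CASE WHEN Category = 4 THEN ID END) AS ProgramIDforCat4
FROM ranked
WHERE RN = 1
GROUP BY PersonID
ORDER BY PersonID
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

A person with no qualifying program (person 3 here) does not appear at all; a LEFT JOIN from Person would be needed to get an all-NULL row for them.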
I want to count how many houses are within a building. Dataset like the following:
BuildingID, HouseID
1, 1
1, 2
1, 3
2, 4
2, 5
2, 6
NULL, 7
NULL, 8
With the following code I get the total count of houses per building. However, houses 7 and 8 don't have a building, so nothing should be counted for them.
SELECT BuildingID
, HouseID
, COUNT(HouseID) OVER (PARTITION BY BuildingID) AS 'Houses in Building'
FROM BUILDING
The result I get:
BuildingID, HouseID, Houses in Building
1, 1, 3
1, 2, 3
1, 3, 3
2, 4, 3
2, 5, 3
2, 6, 3
NULL, 7, 2
NULL, 8, 2
The result I want:
BuildingID, HouseID, Houses in Building
1, 1, 3
1, 2, 3
1, 3, 3
2, 4, 3
2, 5, 3
2, 6, 3
NULL, 7, NULL --or 0
NULL, 8, NULL --or 0
Any suggestions?
Just count the BuildingID. The COUNT function does not count NULLs, so it will work:
COUNT(BuildingID) OVER (PARTITION BY BuildingID) AS 'Houses in Building'
Note that it assumes that HouseID is not null.
You could simply use a CASE expression to show a count only where the BuildingID is not null, or you could change your count to COUNT(BuildingID) rather than COUNT(HouseID), since COUNT ignores NULL values and so gives 0 for the NULL partition. Both yield your required results:
-- Table variables are declared with @ (a # prefix is for temp tables created with CREATE TABLE)
DECLARE @Building TABLE (BuildingID INT, HouseID INT);
INSERT @Building (BuildingID, HouseID)
VALUES
(1, 1), (1, 2), (1, 3), (2, 4), (2, 5),
(2, 6), (NULL, 7), (NULL, 8);
SELECT BuildingID,
HouseID,
CountBuildingID = COUNT(BuildingID) OVER (PARTITION BY BuildingID),
CaseExpression = CASE WHEN BuildingID IS NOT NULL THEN COUNT(HouseID) OVER (PARTITION BY BuildingID) END
FROM @Building
ORDER BY HouseID;
OUTPUT
BuildingID HouseID CountBuildingID CaseExpression
-------------------------------------------------------
1 1 3 3
1 2 3 3
1 3 3 3
2 4 3 3
2 5 3 3
2 6 3 3
NULL 7 0 NULL
NULL 8 0 NULL
You can also try the following self-join option:
WITH your_table (BuildingID, HouseID)
AS
(
SELECT 1, 1 UNION ALL
SELECT 1, 2 UNION ALL
SELECT 1, 3 UNION ALL
SELECT 2, 4 UNION ALL
SELECT 2, 5 UNION ALL
SELECT 2, 6 UNION ALL
SELECT NULL, 7 UNION ALL
SELECT NULL, 8
)
SELECT A.BuildingID,A.HouseID,COUNT(A.BuildingID)Count
FROM your_table A
LEFT JOIN your_table B ON A.BuildingID = B.BuildingID
GROUP BY A.BuildingID,A.HouseID
Output:
BuildingID HouseID Count
1 1 3
1 2 3
1 3 3
2 4 3
2 5 3
2 6 3
NULL 7 0
NULL 8 0
You can use a CASE expression inside the COUNT function, like below:
COUNT(CASE WHEN BuildingID IS NOT NULL THEN HouseID END) OVER (PARTITION BY BuildingID) AS 'Houses in Building'
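To compare the variants side by side, here is a small sketch using Python's built-in sqlite3 module (SQLite 3.25+ assumed for window functions) rather than SQL Server. The table name `Building` and the output column names are just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Building (BuildingID INTEGER, HouseID INTEGER);
INSERT INTO Building VALUES
  (1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (2, 6), (NULL, 7), (NULL, 8);
""")

rows = conn.execute("""
SELECT BuildingID,
       HouseID,
       -- original expression: the NULL partition still counts its two houses
       COUNT(HouseID) OVER (PARTITION BY BuildingID) AS CountHouse,
       -- counting the partition column itself skips the NULLs
       COUNT(BuildingID) OVER (PARTITION BY BuildingID) AS CountBuilding,
       -- CASE inside COUNT: rows without a building contribute NULL
       COUNT(CASE WHEN BuildingID IS NOT NULL THEN HouseID END)
           OVER (PARTITION BY BuildingID) AS CountCase
FROM Building
ORDER BY HouseID
""").fetchall()

for row in rows:
    print(row)
```

The last two columns give 0 for houses 7 and 8; wrap the window function in an outer CASE instead if NULL is preferred over 0.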
I have a table that looks similar to this:
WITH
table AS (
SELECT 1 object_id, 234 type_id, 2 type_level UNION ALL
SELECT 1, 23, 1 UNION ALL
SELECT 1, 24, 1 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 2, 34, 1 UNION ALL
SELECT 2, 46, 1 UNION ALL
SELECT 2, 465, 2 UNION ALL
SELECT 2, 349, 2 UNION ALL
SELECT 2, 4, 0 UNION ALL
SELECT 2, 3, 0 )
SELECT
object_id,
type_id,
type_level
FROM
table
Now I am trying to create three new columns, type_level_0_array, type_level_1_array and type_level_2_array, for each object, and aggregate the type_id values of the corresponding type level into those arrays (I am not looking for a comma-separated string).
So my resultant table should look like the following:
+----+--------------------+--------------------+--------------------+
| id | type_level_0_array | type_level_1_array | type_level_2_array |
+----+--------------------+--------------------+--------------------+
| 1 | 2 | 24,23 | 234 |
+----+--------------------+--------------------+--------------------+
| 2 | 3,4 | 34,46 | 465,349 |
+----+--------------------+--------------------+--------------------+
Is there any way to accomplish that?
Update:
Although it seems that my type_id has certain pattern e.g. level 0 types are of 1 length, level 1 types are of 2 length and so on, in my real dataset there is no such pattern. The identification of level is solely possible by looking at type_level of any row.
Below is for BigQuery Standard SQL
#standardSQL
SELECT object_id,
ARRAY_AGG(DISTINCT IF(type_level = 0, type_id, NULL) IGNORE NULLS) AS type_level_0_array,
ARRAY_AGG(DISTINCT IF(type_level = 1, type_id, NULL) IGNORE NULLS) AS type_level_1_array,
ARRAY_AGG(DISTINCT IF(type_level = 2, type_id, NULL) IGNORE NULLS) AS type_level_2_array
FROM `project.dataset.table`
GROUP BY object_id
You can test and play with the above using the sample data from your question, as below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 object_id, 234 type_id, 2 type_level UNION ALL
SELECT 1, 23, 1 UNION ALL
SELECT 1, 24, 1 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 2, 34, 1 UNION ALL
SELECT 2, 46, 1 UNION ALL
SELECT 2, 465, 2 UNION ALL
SELECT 2, 349, 2 UNION ALL
SELECT 2, 4, 0 UNION ALL
SELECT 2, 3, 0 )
SELECT object_id,
ARRAY_AGG(DISTINCT IF(type_level = 0, type_id, NULL) IGNORE NULLS) AS type_level_0_array,
ARRAY_AGG(DISTINCT IF(type_level = 1, type_id, NULL) IGNORE NULLS) AS type_level_1_array,
ARRAY_AGG(DISTINCT IF(type_level = 2, type_id, NULL) IGNORE NULLS) AS type_level_2_array
FROM `project.dataset.table`
GROUP BY object_id
with result
Row  object_id  type_level_0_array  type_level_1_array  type_level_2_array
1    1          2                   24                  234
                                    23
2    2          4                   34                  349
                3                   46                  465
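If it helps to see what the conditional ARRAY_AGG calls compute, here is a rough plain-Python equivalent. It is only a sketch of the logic, not BigQuery code: the function name `arrays_by_level` is made up, and it keeps first-seen order, whereas ARRAY_AGG without an ORDER BY clause does not guarantee any element order.

```python
from collections import defaultdict

# Sample rows from the question: (object_id, type_id, type_level)
rows = [
    (1, 234, 2), (1, 23, 1), (1, 24, 1), (1, 2, 0), (1, 2, 0),
    (2, 34, 1), (2, 46, 1), (2, 465, 2), (2, 349, 2), (2, 4, 0), (2, 3, 0),
]

def arrays_by_level(rows, levels=(0, 1, 2)):
    """Group type_id values per object and per type_level, dropping duplicates."""
    grouped = defaultdict(lambda: {level: [] for level in levels})
    for object_id, type_id, type_level in rows:
        bucket = grouped[object_id].get(type_level)
        # Like IF(type_level = n, type_id, NULL) IGNORE NULLS: other levels are skipped.
        # The membership test plays the role of DISTINCT.
        if bucket is not None and type_id not in bucket:
            bucket.append(type_id)
    return dict(grouped)

result = arrays_by_level(rows)
for object_id, arrays in sorted(result.items()):
    print(object_id, arrays)
```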
Try this; it works for me.
BigQuery won't let you create an array containing NULLs, which is why IGNORE NULLS is required.
EDIT: I've updated the code to be based on the type_level column.
WITH table
AS (
SELECT 1 object_id, 234 type_id, 2 type_level UNION ALL
SELECT 1, 23, 1 UNION ALL
SELECT 1, 24, 1 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 2, 34, 1 UNION ALL
SELECT 2, 46, 1 UNION ALL
SELECT 2, 465, 2 UNION ALL
SELECT 2, 349, 2 UNION ALL
SELECT 2, 4, 0 UNION ALL
SELECT 2, 3, 0 )
SELECT
object_id
, ARRAY_AGG(CASE WHEN type_level = 0 THEN type_id ELSE NULL END IGNORE NULLS) AS type_level_0_array
, ARRAY_AGG(CASE WHEN type_level = 1 THEN type_id ELSE NULL END IGNORE NULLS) AS type_level_1_array
, ARRAY_AGG(CASE WHEN type_level = 2 THEN type_id ELSE NULL END IGNORE NULLS) AS type_level_2_array
FROM
table
-- one row of arrays per object, as the question asks
GROUP BY
object_id
My project is using an Oracle SQL database. I have a historical table that appends task status on a weekly basis, and am attempting to query the number of weeks a task that is currently off track has been off track. Here's an example excerpt from my source historical table:
ID WEEK ON_TRACK
1 1 N
1 2 Y
1 3 N
1 4 N
1 5 N
2 1 N
2 2 N
2 3 Y
2 4 Y
2 5 N
3 1 N
3 2 N
3 3 Y
3 4 Y
3 5 Y
I'm looking to return the count of consecutive "N" values in ON_TRACK starting backwards from the latest append. For the above example data, I'd like the query to return:
ID WKS_OFF_TRACK
1 3
2 1
3 0
I've done some research, and it looks like the Tabibitosan method is the most logical approach. I've found ample examples that give the max consecutive values matching one criterion, but I'm having trouble tweaking them to return the most recent consecutive values matching two criteria (ID and ON_TRACK).
Here's what I have so far:
--this step creates a temp table with unique IDs for each weekly append to the historical table, and a 1 (if ON_TRACK = N) or 0 (if ON_TRACK = Y). This results in the expected info.
WITH HIST_TBL AS (
SELECT DISTINCT(ID),
CASE ON_TRACK
WHEN 'N' THEN 1
ELSE 0
END AS OFF_TRACK,
WEEK
FROM SOURCE_HISTORICAL_TBL
ORDER BY ID,WEEK DESC)
-- end of temp table
--this is where I'm struggling: I want one line per project number, with the sum of the latest string of 1s (weeks the task has been off track), until a 0 is reached.
SELECT ID,
SUM(OFF_TRACK) AS WKS_OFF_TRACK
FROM (SELECT WEEK,
ID,
OFF_TRACK,
ROW_NUMBER() OVER (ORDER BY WEEK DESC) - ROW_NUMBER() OVER
(PARTITION BY ID,OFF_TRACK ORDER BY WEEK DESC) GRP
FROM HIST_TBL)
GROUP BY ID, GRP
ORDER BY ID;
This code results in a cumulative sum of all weeks each project has been off track, which for my example data would be:
ID WKS_OFF_TRACK
1 4
2 3
3 2
Any ideas where I'm going wrong?
Here is one method that assumes every task was "on track" at some point in time:
select sht.id, count(*)
from SOURCE_HISTORICAL_TBL sht
where sht.week > (select max(sht2.week)
from SOURCE_HISTORICAL_TBL sht2
where sht2.id = sht.id and sht2.on_track = 'Y'
)
group by sht.id;
Otherwise, you need one more condition:
select sht.id, count(*)
from SOURCE_HISTORICAL_TBL sht
where sht.week > (select max(sht2.week)
from SOURCE_HISTORICAL_TBL sht2
where sht2.id = sht.id and sht2.on_track = 'Y'
) or
not exists (select 1
from SOURCE_HISTORICAL_TBL sht2
where sht2.id = sht.id and sht2.on_track = 'Y'
)
group by sht.id;
You can also phrase these as analytic functions:
select id,
sum(case when week > max_week_y or max_week_y is null then 1 else 0 end) as max_off_track
from (select sht.*,
max(case when on_track = 'Y' then week end) over (partition by id) as max_week_y
from SOURCE_HISTORICAL_TBL sht
) sht
group by id;
Note that this version will return 0s for tasks that are currently on track.
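Here is a quick way to check the analytic-function version against the question's data, sketched with Python's sqlite3 module (SQLite 3.25+ assumed) instead of Oracle. ID 4 is a hypothetical task that has never been on track, added to exercise the `max_week_y is null` branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SOURCE_HISTORICAL_TBL (ID INTEGER, WEEK INTEGER, ON_TRACK TEXT)")

# One flag per week, starting at week 1. IDs 1-3 are from the question; ID 4 is hypothetical.
history = {1: "NYNNN", 2: "NNYYN", 3: "NNYYY", 4: "NN"}
conn.executemany(
    "INSERT INTO SOURCE_HISTORICAL_TBL VALUES (?, ?, ?)",
    [(task_id, week, flag)
     for task_id, flags in history.items()
     for week, flag in enumerate(flags, start=1)],
)

rows = conn.execute("""
SELECT id,
       -- count the weeks after the last 'Y' week (or all weeks if there is no 'Y')
       SUM(CASE WHEN week > max_week_y OR max_week_y IS NULL THEN 1 ELSE 0 END) AS wks_off_track
FROM (SELECT sht.*,
             MAX(CASE WHEN on_track = 'Y' THEN week END) OVER (PARTITION BY id) AS max_week_y
      FROM SOURCE_HISTORICAL_TBL sht) AS sht
GROUP BY id
ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```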
You can do it in a single table scan, assuming each ID has exactly one row per week and the week numbers have no gaps:
Oracle 11g R2 Schema Setup:
CREATE TABLE SOURCE_HISTORICAL_TBL ( ID, WEEK, ON_TRACK ) AS
SELECT 1, 1, 'N' FROM DUAL UNION ALL
SELECT 1, 2, 'Y' FROM DUAL UNION ALL
SELECT 1, 3, 'N' FROM DUAL UNION ALL
SELECT 1, 4, 'N' FROM DUAL UNION ALL
SELECT 1, 5, 'N' FROM DUAL UNION ALL
SELECT 2, 1, 'N' FROM DUAL UNION ALL
SELECT 2, 2, 'N' FROM DUAL UNION ALL
SELECT 2, 3, 'Y' FROM DUAL UNION ALL
SELECT 2, 4, 'Y' FROM DUAL UNION ALL
SELECT 2, 5, 'N' FROM DUAL UNION ALL
SELECT 3, 1, 'N' FROM DUAL UNION ALL
SELECT 3, 2, 'N' FROM DUAL UNION ALL
SELECT 3, 3, 'Y' FROM DUAL UNION ALL
SELECT 3, 4, 'Y' FROM DUAL UNION ALL
SELECT 3, 5, 'Y' FROM DUAL UNION ALL
SELECT 4, 1, 'N' FROM DUAL UNION ALL
SELECT 5, 1, 'Y' FROM DUAL;
Query 1:
SELECT ID,
GREATEST(
COALESCE( MAX( CASE ON_TRACK WHEN 'N' THEN WEEK END ), 0 )
- COALESCE( MAX( CASE ON_TRACK WHEN 'Y' THEN WEEK END ), 0 ),
0
) AS weeks
FROM SOURCE_HISTORICAL_TBL
GROUP BY id
ORDER BY id
Results:
| ID | WEEKS |
|----|-------|
| 1 | 3 |
| 2 | 1 |
| 3 | 0 |
| 4 | 1 |
| 5 | 0 |
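One thing to keep in mind: the query above subtracts week numbers rather than counting rows, so it matches the row count only when the weeks are consecutive. A small sketch with Python's sqlite3 module shows the difference. SQLite has no GREATEST, so the two-argument scalar MAX stands in for it, and ID 6 is a hypothetical task with missing weeks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SOURCE_HISTORICAL_TBL (ID INTEGER, WEEK INTEGER, ON_TRACK TEXT)")
conn.executemany(
    "INSERT INTO SOURCE_HISTORICAL_TBL VALUES (?, ?, ?)",
    [
        # ID 1 from the question: consecutive weeks
        (1, 1, "N"), (1, 2, "Y"), (1, 3, "N"), (1, 4, "N"), (1, 5, "N"),
        # ID 6 is hypothetical: weeks 2 and 4 are missing
        (6, 1, "Y"), (6, 3, "N"), (6, 5, "N"),
    ],
)

# Week-number difference, clamped at 0 (scalar MAX replaces Oracle's GREATEST)
by_difference = conn.execute("""
SELECT ID,
       MAX(COALESCE(MAX(CASE ON_TRACK WHEN 'N' THEN WEEK END), 0)
         - COALESCE(MAX(CASE ON_TRACK WHEN 'Y' THEN WEEK END), 0),
           0) AS weeks
FROM SOURCE_HISTORICAL_TBL
GROUP BY ID
ORDER BY ID
""").fetchall()

# Row count of the weeks recorded after the last 'Y' week
by_count = conn.execute("""
SELECT sht.ID, COUNT(*) AS weeks
FROM SOURCE_HISTORICAL_TBL sht
WHERE sht.WEEK > COALESCE((SELECT MAX(s2.WEEK)
                           FROM SOURCE_HISTORICAL_TBL s2
                           WHERE s2.ID = sht.ID AND s2.ON_TRACK = 'Y'), 0)
GROUP BY sht.ID
ORDER BY sht.ID
""").fetchall()

print(by_difference)
print(by_count)
```

Both approaches agree for ID 1, but for ID 6 the difference is 4 weeks while only 2 rows were recorded. Which one is "right" depends on whether a missing week should count as off track.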
I am trying to pull rows from a table where an ID has more than one version, at least one version has a non-null person, and the versions that come after it have a null person.
So, if I had a statement like:
select ID, version, person from table1
the output would be:
ID Version Person
-- ------- ------
1 1 Tom
1 2 null
1 3 null
2 1 null
2 2 null
2 3 null
3 1 Mary
3 2 Mary
4 1 Joseph
4 2 null
4 3 Samantha
The version number is unbounded; there is no limit on how many versions an ID can have.
I want to pull ID 1 version 2/3, and ID 4 Version 2.
So in the case of ID 2 where the person is null for all three rows I don't need these rows. And in the case of ID 3 version 1 and 2 I don't need these rows because there is never a null value.
This is a very simple version of the table I am working with but the "real" table is a lot more complicated with a bunch of joins already in it.
The desired output would be:
ID Version Person
-- ------- ------
1 2 null
1 3 null
4 2 null
The result set I am looking for is the rows where a previous version of the same ID had a person listed but the person is now null.
You are seeking all rows where the person is null and the same ID has an earlier version with a non-null person:
Edited predicate based on comment
with sample_data as
(select 1 id, 1 version, 'Tom' person from dual union all
select 1, 2, null from dual union all
select 1, 3, null from dual union all
select 2, 1, null from dual union all
select 2, 2, null from dual union all
select 2, 3, null from dual union all
select 3, 1, 'Mary' from dual union all
select 3, 2, 'Mary' from dual union all
select 4, 1, 'Joseph' from dual union all
select 4, 2, null from dual union all
select 4, 3, 'Samantha' from dual)
select *
from sample_data sd
where person is null
and exists
(select 1 from sample_data
where id = sd.id
and person is not null
and version < sd.version);
/* Old predicate
and id in
(select id from sample_data where person is not null);
*/
I think this query translates pretty nicely into what you asked for?
List all the rows (R) where the person is null, but only if a previous row (P) with a non-null name exists.
select *
from table1 r
where r.person is null
and exists(
select 'x'
from table1 p
where p.id = r.id
and p.version < r.version
and p.person is not null
);
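The EXISTS query can be checked against the question's sample data without an Oracle instance. This is a minimal sketch using Python's sqlite3 module; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, version INTEGER, person TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 1, "Tom"), (1, 2, None), (1, 3, None),
        (2, 1, None), (2, 2, None), (2, 3, None),
        (3, 1, "Mary"), (3, 2, "Mary"),
        (4, 1, "Joseph"), (4, 2, None), (4, 3, "Samantha"),
    ],
)

rows = conn.execute("""
SELECT r.id, r.version, r.person
FROM table1 r
WHERE r.person IS NULL
  AND EXISTS (
      -- an earlier version of the same id that had a person
      SELECT 1
      FROM table1 p
      WHERE p.id = r.id
        AND p.version < r.version
        AND p.person IS NOT NULL
  )
ORDER BY r.id, r.version
""").fetchall()

for row in rows:
    print(row)
```

ID 2 is excluded because no version ever had a person, and ID 3 because no version is null.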
I believe the below should work. Note that it only checks that the ID has both null and non-null persons, not that the non-null person comes from an earlier version.
select ID, listagg(version, ', ') within group (order by version) as versions
from table1 t1
where 0 < (select count(*) from table1 t1A where t1A.ID = t1.ID and t1A.person is not null)
and 0 < (select count(*) from table1 t1B where t1B.ID = t1.ID and t1B.person is null)
and person is null
group by ID
This should do what you want:
select id, version, person
from
(
select id, version, person,
lag(person, 1) ignore nulls
over (partition by id
order by version) as x
from table1
) dt
where person is null
and x is not null
I have a set of data as below, showing the history of who has done what with a record. The unique identifier for each record is shown in 'ID' and 'Rec No' is the sequential number assigned to each interaction with the record.
ID Rec No Who Type
1 1 Bob New
1 2 Bob Open
1 3 Bob Assign
1 4 Sarah Add
1 5 Bob Add
1 6 Bob Close
2 1 John New
2 2 John Open
2 3 John Assign
2 4 Bob Assign
2 5 Sarah Add
2 6 Sarah Close
3 1 Sarah New
3 2 Sarah Open
3 3 Sarah Assign
3 4 Sarah Close
I need to find all of the 'Assign' operations. However, where an ID has multiple 'Assign' operations, I want only the first one. I then also want the name of the person who did it.
So ultimately, from the above data, I would like the output to be:
Who Count (assign)
Bob 1
John 1
Sarah 1
The code I have at the moment is-
SELECT IH.WHO, Count(IH.ID)
FROM Table.INCIDENTS_H IH
WHERE (IH.TYPE = 'Assign')
GROUP BY IH.WHO
But this gives the output as-
Who Count (assign)
Bob 2
John 1
Sarah 1
This is because it also counts the Assign that Bob did on ID 2, Rec No 4.
Any help would be appreciated. I am using MS SQL.
I think something like this is what you are after:
select
who, count(id)
from (
select ID, Who, row_number() over (partition by ID order by [Rec No]) [rownum]
from Table.INCIDENTS_H IH
WHERE (IH.TYPE = 'Assign')
) a
where rownum = 1
group by who
This should count only the first Assign (ordered by [Rec No]) within each ID group.
This ought to do it:
SELECT IH.WHO, COUNT(IH.ID)
FROM INCIDENTS_H IH
JOIN (
SELECT ID, MIN([Rec No]) [Rec No]
FROM INCIDENTS_H
WHERE ([Type] = 'Assign')
GROUP BY ID
) IH2
ON IH2.ID = IH.ID AND IH2.[Rec No] = IH.[Rec No]
GROUP BY IH.WHO
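To see the MIN-per-ID join run against the question's data, here is a small sketch using Python's sqlite3 module in place of SQL Server. The column is named `RecNo` instead of `[Rec No]` to keep the quoting simple, and the ORDER BY is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INCIDENTS_H (ID INTEGER, RecNo INTEGER, Who TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO INCIDENTS_H VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Bob", "New"), (1, 2, "Bob", "Open"), (1, 3, "Bob", "Assign"),
        (1, 4, "Sarah", "Add"), (1, 5, "Bob", "Add"), (1, 6, "Bob", "Close"),
        (2, 1, "John", "New"), (2, 2, "John", "Open"), (2, 3, "John", "Assign"),
        (2, 4, "Bob", "Assign"), (2, 5, "Sarah", "Add"), (2, 6, "Sarah", "Close"),
        (3, 1, "Sarah", "New"), (3, 2, "Sarah", "Open"), (3, 3, "Sarah", "Assign"),
        (3, 4, "Sarah", "Close"),
    ],
)

rows = conn.execute("""
SELECT IH.Who, COUNT(IH.ID) AS AssignCount
FROM INCIDENTS_H IH
JOIN (
    -- lowest RecNo among the 'Assign' rows of each ID = the first Assign
    SELECT ID, MIN(RecNo) AS RecNo
    FROM INCIDENTS_H
    WHERE Type = 'Assign'
    GROUP BY ID
) IH2
  ON IH2.ID = IH.ID AND IH2.RecNo = IH.RecNo
GROUP BY IH.Who
ORDER BY IH.Who
""").fetchall()

for row in rows:
    print(row)
```

Bob's second Assign (ID 2, Rec No 4) is not counted because John's Assign at Rec No 3 is the first one for that ID.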
You can use row_number to accomplish this
WITH INCIDENTS_H as (
SELECT
1 as ID, 1 as RecNo, 'Bob' as Who, 'New' as type
UNION ALL SELECT 1, 2, 'Bob','Open'
UNION ALL SELECT 1, 3, 'Bob','Assign'
UNION ALL SELECT 1, 4, 'Sarah','Add'
UNION ALL SELECT 1, 5, 'Bob','Add'
UNION ALL SELECT 1, 6, 'Bob','Close'
UNION ALL SELECT 2, 1, 'John','New'
UNION ALL SELECT 2, 2, 'John','Open'
UNION ALL SELECT 2, 3, 'John','Assign'
UNION ALL SELECT 2, 4, 'Bob','Assign'
UNION ALL SELECT 2, 5, 'Sarah','Add'
UNION ALL SELECT 2, 6, 'Sarah','Close'
UNION ALL SELECT 3, 1, 'Sarah','New'
UNION ALL SELECT 3, 2, 'Sarah','Open'
UNION ALL SELECT 3, 3, 'Sarah','Assign'
UNION ALL SELECT 3, 4, 'Sarah','Close')
, GetTheMin AS (
SELECT
ROW_NUMBER() over (partition by id order by recno) row,
ID,
RecNo,
Who,
type
FROM
INCIDENTS_H
WHERE
type = 'Assign'
)
SELECT Who,
COUNT(ID)
FROM GetTheMin
WHERE
row = 1
GROUP BY
who
Or you can use CROSS APPLY:
SELECT
who,
COUNT(id) id
FROM
(SELECT DISTINCT
MinValues.*
FROM
INCIDENTS_H h
CROSS APPLY ( SELECT TOP 1 *
FROM INCIDENTS_H h2
WHERE h.id = h2.id
AND h2.type = 'Assign' -- without this filter, TOP 1 returns the first record of any type
ORDER BY ID, RecNo asc) MinValues) getTheMin
GROUP BY WHO
Or you can use MIN, which is standard SQL, as John Fisher's answer demonstrates.
Here's a view of everything in the table which should match your "first assign" requirement:
select a.*
from Table.INCIDENTS_H a
inner join
(select ID, min([Rec No]) [Rec No] from Table.INCIDENTS_H where Type = 'Assign' group by ID) b
on a.ID = b.ID and a.[Rec No] = b.[Rec No]
Result:
ID Rec No Who Type
1 3 Bob Assign
2 3 John Assign
3 3 Sarah Assign
select * from
(select
id, rec_no, who
from
operation_history
where
type = 'Assign'
order by rec_no asc) table_alias
group by
id
order by id asc
Tested and here are the results:
id rec_no who
1 3 Bob
2 3 John
3 3 Sarah
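(This code is not SQL Server syntax. It relies on an engine, such as MySQL with ONLY_FULL_GROUP_BY disabled, that allows selecting non-aggregated columns alongside GROUP BY, and which row is returned for each group is not guaranteed.)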
Here is the query with the test data from the original post inlined as a CTE:
with T (ID, RecNo, Who, Type) as
(
select 1, 1, 'Bob', 'New' union all
select 1, 2, 'Bob', 'open' union all
select 1, 3, 'Bob', 'Assign' union all
select 1, 4, 'Sarah', 'Add' union all
select 1, 5, 'Bob', 'Add' union all
select 1, 6, 'Bob', 'Close' union all
select 2, 1, 'John', 'New' union all
select 2, 2, 'John', 'Open' union all
select 2, 3, 'John', 'Assign' union all
select 2, 4, 'Bob', 'Assign' union all
select 2, 5, 'Sarah', 'Add' union all
select 2, 6, 'Sarah', 'Close' union all
select 3, 1, 'Sarah', 'New' union all
select 3, 2, 'Sarah', 'Open' union all
select 3, 3, 'Sarah', 'Assign' union all
select 3, 4, 'Sarah', 'Close'
)
select top 1 with ties *
from T
where Type = 'Assign'
order by row_number() over(partition by ID order by RecNo)
The "select" statement that can be applied to the real situation from the question might look like:
SELECT TOP 1 WITH TIES
IH.ID, IH.[Rec No], IH.WHO, IH.TYPE
FROM Table.INCIDENTS_H IH
WHERE IH.TYPE = 'Assign'
ORDER BY ROW_NUMBER() OVER(PARTITION BY IH.ID ORDER BY IH.[Rec No]);