I have a table that looks like this:
Date User Product
11/15/2019 123 NULL
11/21/2019 123 A
11/21/2019 123 A
11/23/2019 123 B
I want to run a dense_rank function that will skip the null values.
Below is what I currently have:
CASE WHEN PRODUCT IS NOT NULL
THEN DENSE_RANK()
OVER (PARTITION BY USER ORDER BY DATE ASC)
ELSE 1
END DENSE_RANK_OUTPUT
My current output:
Date User Product DENSE_RANK_OUTPUT
11/15/2019 123 NULL 1
11/21/2019 123 A 2
11/21/2019 123 A 2
11/23/2019 123 B 3
My desired output is:
Date User Product DESIRED_OUTPUT
11/15/2019 123 NULL 1
11/21/2019 123 A 1
11/21/2019 123 A 1
11/23/2019 123 B 2
You are close. Just add another key to the PARTITION BY, so that NULL and non-NULL products are ranked in separate partitions:
(CASE WHEN PRODUCT IS NOT NULL
THEN DENSE_RANK() OVER (PARTITION BY USER, (PRODUCT IS NOT NULL) ORDER BY DATE ASC)
ELSE 1
END) as DENSE_RANK_OUTPUT
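To check the idea against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a SQLite build of 3.25 or later (needed for window functions). The table name purchases, the column name user_id (instead of USER) and the ISO-formatted dates are assumptions made for the demo, not part of the question.

```python
import sqlite3

# In-memory database holding the four sample rows (dates rewritten as ISO strings
# so that text ordering matches chronological ordering).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (dt TEXT, user_id INTEGER, product TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("2019-11-15", 123, None),
        ("2019-11-21", 123, "A"),
        ("2019-11-21", 123, "A"),
        ("2019-11-23", 123, "B"),
    ],
)

# The extra partition key (product IS NOT NULL) keeps NULL rows out of the
# ranking of the non-NULL rows; the CASE then pins NULL rows to 1.
rows = conn.execute(
    """
    SELECT dt, product,
           CASE WHEN product IS NOT NULL
                THEN DENSE_RANK() OVER (
                         PARTITION BY user_id, (product IS NOT NULL)
                         ORDER BY dt)
                ELSE 1
           END AS dense_rank_output
    FROM purchases
    ORDER BY dt
    """
).fetchall()

for row in rows:
    print(row)
```

Not every database accepts a boolean expression such as (product IS NOT NULL) as a partition key; where it is rejected, a CASE expression that returns 0 or 1 can play the same role.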
I have a table Cards(card_id,status,cid)
With the columns:
cid - customer id
status - exp/vld
card_id - card id's
How do I find the cid with the most expired cards?
From Oracle 12, you can use:
SELECT cid,
COUNT(*) AS num_exp
FROM cards
WHERE status = 'exp'
GROUP BY cid
ORDER BY num_exp DESC
FETCH FIRST ROW WITH TIES;
You can count the expired cards for each customer and then pick the customer(s) with the maximum count. The query below should give that result.
WITH t AS(
SELECT cid, count(1) customer_exp_cards_count
FROM Cards where status = 'exp'
group by cid)
SELECT cid FROM t t1
WHERE t1.customer_exp_cards_count IN (SELECT MAX(t2.customer_exp_cards_count)
FROM t t2)
Sample data and its result:
cardid status cid
3 exp 5
1 exp 1
2 exp 1
3 vld 1
5 vld 1
1 exp 2
2 exp 2
3 exp 2
4 vld 2
5 vld 2
6 exp 3
7 vld 4
4 vld 5
Result:
2
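As a quick check of the count-then-MAX approach on the sample data above, here is a minimal sketch using Python's built-in sqlite3 module. The column name num_exp and the use of SQLite (rather than Oracle) are assumptions for the demo; FETCH FIRST ... WITH TIES is not available in SQLite, so only the CTE variant is exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (card_id INTEGER, status TEXT, cid INTEGER)")
conn.executemany(
    "INSERT INTO cards VALUES (?, ?, ?)",
    [
        (3, "exp", 5), (1, "exp", 1), (2, "exp", 1), (3, "vld", 1),
        (5, "vld", 1), (1, "exp", 2), (2, "exp", 2), (3, "exp", 2),
        (4, "vld", 2), (5, "vld", 2), (6, "exp", 3), (7, "vld", 4),
        (4, "vld", 5),
    ],
)

# Count expired cards per customer, then keep every customer whose count
# equals the maximum (so ties would all be returned).
top = conn.execute(
    """
    WITH t AS (
        SELECT cid, COUNT(*) AS num_exp
        FROM cards
        WHERE status = 'exp'
        GROUP BY cid
    )
    SELECT cid, num_exp
    FROM t
    WHERE num_exp = (SELECT MAX(num_exp) FROM t)
    ORDER BY cid
    """
).fetchall()

print(top)
```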
Suppose you have these two tables (sample data only):
CUSTOMERS
CUST_ID  CUST_NAME  CUST_STATUS
101      John       ACTIVE
102      Annie      ACTIVE
103      Jane       ACTIVE
104      Bob        INACTIVE
CARDS
CARD_ID  CARD_STATUS  CUST_ID
1001001  VALID        101
1001002  VALID        101
1001003  EXPIRED      101
1001004  EXPIRED      101
1001005  VALID        101
1002010  VALID        102
1002020  EXPIRED      102
1002030  EXPIRED      102
1002040  EXPIRED      102
1003100  VALID        103
1003200  VALID        103
If you just want the CUST_ID with the most expired cards, along with that count, you can do it without the CUSTOMERS table:
Select CUST_ID, EXPIRED_CARDS
From (Select CUST_ID, Count(CARD_ID) "EXPIRED_CARDS" From cards Where CARD_STATUS = 'EXPIRED' Group By CUST_ID)
Where EXPIRED_CARDS = (Select Max(EXPIRED_CARDS) From (Select Count(CARD_ID) "EXPIRED_CARDS" From cards Where CARD_STATUS = 'EXPIRED' Group By CUST_ID) )
--
-- R e s u l t
-- CUST_ID EXPIRED_CARDS
-- ---------- -------------
-- 102 3
You could also consider creating a CTE that combines both tables. It gives you a dataset you can reuse for other questions, not just this one. Something like this:
WITH
customers_cards AS
(
Select
cst.CUST_ID,
cst.CUST_NAME,
cst.CUST_STATUS,
crd.CARD_ID,
crd.CARD_STATUS,
Sum(CASE WHEN crd.CUST_ID Is Null Then 0 Else 1 End) OVER(Partition By crd.CUST_ID) "TOTAL_NUM_OF_CARDS",
Sum(CASE WHEN crd.CARD_ID Is Null Then Null WHEN crd.CARD_STATUS = 'VALID' And crd.CARD_ID Is Not Null Then 1 Else 0 End) OVER(Partition By crd.CUST_ID) "VALID_CARDS",
Sum(CASE WHEN crd.CARD_ID Is Null Then Null WHEN crd.CARD_STATUS = 'EXPIRED' And crd.CARD_ID Is Not Null Then 1 Else 0 End) OVER(Partition By crd.CUST_ID) "EXPIRED_CARDS"
From
customers cst
Left Join
cards crd on(crd.CUST_ID = cst.CUST_ID)
)
/* R e s u l t :
CUST_ID CUST_NAME CUST_STATUS CARD_ID CARD_STATUS TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
---------- --------- ----------- ------- ----------- ------------------ ----------- -------------
101 John ACTIVE 1001001 VALID 5 3 2
101 John ACTIVE 1001002 VALID 5 3 2
101 John ACTIVE 1001003 EXPIRED 5 3 2
101 John ACTIVE 1001004 EXPIRED 5 3 2
101 John ACTIVE 1001005 VALID 5 3 2
102 Annie ACTIVE 1002010 VALID 4 1 3
102 Annie ACTIVE 1002040 EXPIRED 4 1 3
102 Annie ACTIVE 1002030 EXPIRED 4 1 3
102 Annie ACTIVE 1002020 EXPIRED 4 1 3
103 Jane ACTIVE 1003100 VALID 2 2 0
103 Jane ACTIVE 1003200 VALID 2 2 0
104 Bob INACTIVE 0
*/
This can be used to answer many more potential questions. Here is the list of customers sorted by number of expired cards (descending):
Select Distinct
CUST_ID, CUST_NAME, TOTAL_NUM_OF_CARDS, VALID_CARDS, EXPIRED_CARDS
From
customers_cards
Order By
EXPIRED_CARDS Desc Nulls Last, CUST_ID
--
-- R e s u l t :
-- CUST_ID CUST_NAME TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
-- ---------- --------- ------------------ ----------- -------------
-- 102 Annie 4 1 3
-- 101 John 5 3 2
-- 103 Jane 2 2 0
-- 104 Bob 0
OR to answer your question:
Select Distinct
CUST_ID, CUST_NAME, TOTAL_NUM_OF_CARDS, VALID_CARDS, EXPIRED_CARDS
From
customers_cards
Where
EXPIRED_CARDS = (Select Max(EXPIRED_CARDS) From customers_cards)
Order By
CUST_ID
--
-- R e s u l t :
-- CUST_ID CUST_NAME TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
-- ---------- --------- ------------------ ----------- -------------
-- 102 Annie 4 1 3
Regards...
Here is the sample data from the employee vacation table.
Emp_id Vacation_Start_Date Vacation_End_Date Public_Hday
1234 06/01/2022 06/07/2022 null
1234 06/08/2022 06/14/2022 null
1234 06/15/2022 06/19/2022 06/17/2022
1234 06/20/2022 06/23/2022 null
1234 06/24/2022 06/28/2022 null
1234 06/29/2022 07/02/2022 06/30/2022
1234 07/03/2022 07/07/2022 null
1234 07/08/2022 07/12/2022 null
1234 07/13/2022 07/17/2022 07/15/2022
1234 07/18/2022 07/22/2022 null
I want to group these vacations based on the public holidays in between (Assuming that all the vacations are consecutive). Here is the output that I am trying to get.
Emp_id Vacation_Start_Date Vacation_End_Date Public_Hday Group
1234 06/01/2022 06/07/2022 null 0
1234 06/08/2022 06/14/2022 null 0
1234 06/15/2022 06/19/2022 06/17/2022 1
1234 06/20/2022 06/23/2022 null 1
1234 06/24/2022 06/28/2022 null 1
1234 06/29/2022 07/02/2022 06/30/2022 2
1234 07/03/2022 07/07/2022 null 2
1234 07/08/2022 07/12/2022 null 2
1234 07/13/2022 07/17/2022 07/15/2022 3
1234 07/18/2022 07/22/2022 null 3
Here is the code that I tried
Select *, dense_rank() over (partition by Emp_id order by Public_Hday) - 1 AS Group from Emp_Vacation.
But it gave the expected group values only for the vacations where Public_Hday is not null. How do I get the group values for the other vacations?
You can use a conditional sum() over()
Select *
,Grp = sum( case when [Public_Hday] is null then 0 else 1 end ) over (partition by [Emp_id] order by [Vacation_Start_Date])
from YourTable
Applied to the sample data, Grp matches the desired Group column (0, 0, 1, 1, 1, 2, 2, 2, 3, 3).
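The running conditional sum can be tried out with a minimal sketch in Python's built-in sqlite3 module, assuming SQLite 3.25 or later for window functions. The lowercase table and column names and the ISO date strings are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_vacation ("
    "emp_id INTEGER, vacation_start_date TEXT, "
    "vacation_end_date TEXT, public_hday TEXT)"
)
conn.executemany(
    "INSERT INTO emp_vacation VALUES (?, ?, ?, ?)",
    [
        (1234, "2022-06-01", "2022-06-07", None),
        (1234, "2022-06-08", "2022-06-14", None),
        (1234, "2022-06-15", "2022-06-19", "2022-06-17"),
        (1234, "2022-06-20", "2022-06-23", None),
        (1234, "2022-06-24", "2022-06-28", None),
        (1234, "2022-06-29", "2022-07-02", "2022-06-30"),
        (1234, "2022-07-03", "2022-07-07", None),
        (1234, "2022-07-08", "2022-07-12", None),
        (1234, "2022-07-13", "2022-07-17", "2022-07-15"),
        (1234, "2022-07-18", "2022-07-22", None),
    ],
)

# Each row with a public holiday contributes 1 to the running total, so the
# total only steps up on holiday rows and is carried forward on the others.
groups = [
    grp
    for _, grp in conn.execute(
        """
        SELECT vacation_start_date,
               SUM(CASE WHEN public_hday IS NULL THEN 0 ELSE 1 END)
                   OVER (PARTITION BY emp_id
                         ORDER BY vacation_start_date) AS grp
        FROM emp_vacation
        ORDER BY vacation_start_date
        """
    )
]

print(groups)
```

Unlike dense_rank() over Public_Hday, the running sum is ordered by the vacation start date, which is why rows without a holiday inherit the group of the last holiday before them.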
I have a table with user shopping data, as shown below.
I want an output similar to a running total, except that it should be the running count of unique categories the user has shopped in, by date.
I know I have to use a ROWS ... PRECEDING/FOLLOWING frame in the count function, but I am not able to use count(distinct category) in a window function.
Dt category userId
4/10/2022 Grocery 123
4/11/2022 Grocery 123
4/12/2022 MISC 123
4/13/2022 SERVICES 123
4/14/2022 RETAIl 123
4/15/2022 TRANSP 123
4/20/2022 GROCERY 123
Desired output
Dt userID number of unique categories
4/10/2022 123 1
4/11/2022 123 1
4/12/2022 123 2
4/13/2022 123 3
4/14/2022 123 4
4/15/2022 123 5
4/20/2022 123 5
Consider the approach below (BigQuery syntax):
select Dt, userId,
( select count(distinct category)
from t.categories as category
) number_of_unique_categories
from (
select *, array_agg(lower(category)) over(partition by userId order by Dt) categories
from your_table
) t
Applied to the sample data in your question, this produces the desired output; lower() makes Grocery and GROCERY count as the same category.
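For engines without array_agg as a window function, the same numbers can be obtained another way. The sketch below uses Python's built-in sqlite3 module and swaps in a different technique, a correlated subquery that counts distinct categories up to each row's date; it is not the array-based query above. The table name shopping and the ISO date strings are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shopping (dt TEXT, category TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO shopping VALUES (?, ?, ?)",
    [
        ("2022-04-10", "Grocery", 123),
        ("2022-04-11", "Grocery", 123),
        ("2022-04-12", "MISC", 123),
        ("2022-04-13", "SERVICES", 123),
        ("2022-04-14", "RETAIl", 123),
        ("2022-04-15", "TRANSP", 123),
        ("2022-04-20", "GROCERY", 123),
    ],
)

# For every row, count the distinct (case-insensitive) categories the same
# user has bought on or before that row's date.
rows = conn.execute(
    """
    SELECT s.dt, s.user_id,
           (SELECT COUNT(DISTINCT LOWER(p.category))
            FROM shopping p
            WHERE p.user_id = s.user_id
              AND p.dt <= s.dt) AS number_of_unique_categories
    FROM shopping s
    ORDER BY s.user_id, s.dt
    """
).fetchall()

counts = [r[2] for r in rows]
print(counts)
```

The correlated subquery rescans earlier rows for each output row, so on large tables it is likely to be slower than a window-based approach.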
I'm trying to match and align data or, put another way, count occurrences and then list the names for which each value occurs.
Or, in a question: "How many times does each ID value occur, and for what names?"
For example, with this input
Name ID
-------------
jim 123
jim 234
jim 345
john 123
john 345
jane 234
jane 345
jan 45678
I want the output to be:
count ID name name name
------------------------------------
3 345 jim john jane
2 123 jim john
2 234 jim jane
1 45678 jan
Or similarly, the input could be (noticing that the ID values are not aligned),
jim john jane jan
----------------------------
123 345 234 45678
234 123 345
345
but that seems to complicate things.
The closest I have come to the desired result is this SQL:
select ID, count(ID) as count
from table
group by (ID)
order by count desc
which outputs
ID count
------------
345 3
123 2
234 2
45678 1
I'll appreciate help.
You seem to want a pivot. In SQL, you have to specify the number of columns in advance (unless you construct the query as a string).
But the idea is:
select ID, count(*) as cnt,
max(case when seqnum = 1 then name end) as name_1,
max(case when seqnum = 2 then name end) as name_2,
max(case when seqnum = 3 then name end) as name_3
from (select t.*,
row_number() over (partition by id order by id) as seqnum -- arbitrary ordering
from table t
) t
group by ID
order by cnt desc;
If you have an unknown number of columns, you can aggregate the values into an array:
select ID, count(*) as cnt,
array_agg(name order by name) as names
from table t
group by ID
order by cnt desc;
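The aggregate-into-a-list idea can be sketched with Python's built-in sqlite3 module. SQLite has no array_agg, so GROUP_CONCAT is used here as a stand-in; the table name name_ids is an assumption, and because GROUP_CONCAT gives no ordering guarantee without extra syntax, the names are sorted in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_ids (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO name_ids VALUES (?, ?)",
    [
        ("jim", 123), ("jim", 234), ("jim", 345),
        ("john", 123), ("john", 345),
        ("jane", 234), ("jane", 345),
        ("jan", 45678),
    ],
)

# One row per id: how often it occurs and which names it occurs for.
rows = conn.execute(
    """
    SELECT id, COUNT(*) AS cnt, GROUP_CONCAT(name, ',') AS names
    FROM name_ids
    GROUP BY id
    ORDER BY cnt DESC, id
    """
).fetchall()

# Split the concatenated names and sort them for a stable presentation.
result = [(id_, cnt, sorted(names.split(","))) for id_, cnt, names in rows]

for line in result:
    print(line)
```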
The query would look something like this, if that's what you're looking for:
SELECT
name,
id,
COUNT(id) as count
FROM
dataSet
WHERE
dataSet.name = 'input'
AND dataSet.id = 'input'
GROUP BY
name,
id
I have a crosstab query in MS Access which I want to replicate in T-SQL.
The T-SQL table '#tmpZSPO_DMD' has the columns Part, Location, Qty and FiscalMonthPeriod. When I query it, the data looks like this:
Part LOCATION Qty FiscalMonthPeriod
123 4040_0086 1 CON00
123 4040_0086 1 CON00
123 4200_0010 1 CON00
123 2070_0060 2 CON01
123 2080_0061 1 CON01
123 4040_0070 1 CON02
123 4040_0070 2 CON02
123 4040_0086 1 CON02
123 2020_0060 2 CON03
123 2020_0064 1 CON03
123 2040_0060 1 CON03
123 4040_0061 1 CON03
123 4040_0061 1 CON03
123 4040_0069 1 CON03
123 4040_0070 1 CON03
I am looking to achieve the below result.
Part  LOCATION   CON00  CON01  CON02  CON03
123   2020_0060                       2
123   2020_0064                       1
123   2040_0060                       1
123   2070_0060         2
123   2080_0061         1
123   4040_0061                       2
123   4040_0069                       1
123   4040_0070                3      1
123   4040_0086  2             1
123   4200_0010  1
A very simple PIVOT will do the job.
SELECT *
FROM
(
SELECT Part, LOCATION, Qty, FiscalMonthPeriod
FROM #Table
) t
PIVOT
(
SUM(Qty)
FOR FiscalMonthPeriod IN ([CON00], [CON01], [CON02], [CON03])
) p
Select Part
,LOCATION
,ISNULL(CON00 , 0) AS CON00
,ISNULL(CON01 , 0) AS CON01
,ISNULL(CON02 , 0) AS CON02
,ISNULL(CON03 , 0) AS CON03
FROM tablename T
PIVOT (SUM(Qty)
FOR FiscalMonthPeriod
IN(CON00 , CON01, CON02, CON03)
)p
SELECT tm.PART, tm.Location,
SUM(IIF(tm.FiscalMonthPeriod = 'CON00', tm.Qty, NULL)) As CON00,
SUM(IIF(tm.FiscalMonthPeriod = 'CON01', tm.Qty, NULL)) As CON01,
SUM(IIF(tm.FiscalMonthPeriod = 'CON02', tm.Qty, NULL)) As CON02,
SUM(IIF(tm.FiscalMonthPeriod = 'CON03', tm.Qty, NULL)) As CON03
FROM #tmpZSPO_DMD tm
GROUP BY tm.PART, tm.Location;
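The conditional-aggregation form of the pivot is portable enough to try outside SQL Server. Here is a minimal sketch using Python's built-in sqlite3 module, with CASE standing in for IIF; the table name demand and the lowercase column names are assumptions for the demo, and the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demand ("
    "part INTEGER, location TEXT, qty INTEGER, fiscal_month_period TEXT)"
)
conn.executemany(
    "INSERT INTO demand VALUES (?, ?, ?, ?)",
    [
        (123, "4040_0086", 1, "CON00"), (123, "4040_0086", 1, "CON00"),
        (123, "4200_0010", 1, "CON00"), (123, "2070_0060", 2, "CON01"),
        (123, "2080_0061", 1, "CON01"), (123, "4040_0070", 1, "CON02"),
        (123, "4040_0070", 2, "CON02"), (123, "4040_0086", 1, "CON02"),
        (123, "2020_0060", 2, "CON03"), (123, "2020_0064", 1, "CON03"),
        (123, "2040_0060", 1, "CON03"), (123, "4040_0061", 1, "CON03"),
        (123, "4040_0061", 1, "CON03"), (123, "4040_0069", 1, "CON03"),
        (123, "4040_0070", 1, "CON03"),
    ],
)

# One SUM per target column; a CASE without ELSE yields NULL for other
# periods, and SUM ignores NULLs, so empty cells come back as NULL (None).
rows = conn.execute(
    """
    SELECT part, location,
           SUM(CASE WHEN fiscal_month_period = 'CON00' THEN qty END) AS con00,
           SUM(CASE WHEN fiscal_month_period = 'CON01' THEN qty END) AS con01,
           SUM(CASE WHEN fiscal_month_period = 'CON02' THEN qty END) AS con02,
           SUM(CASE WHEN fiscal_month_period = 'CON03' THEN qty END) AS con03
    FROM demand
    GROUP BY part, location
    ORDER BY location
    """
).fetchall()

# Index the result by location for easy lookup.
pivot = {loc: (c0, c1, c2, c3) for _, loc, c0, c1, c2, c3 in rows}

for row in rows:
    print(row)
```

As with PIVOT, the list of periods is fixed in the query text; an unknown set of periods would need the statement to be built dynamically.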