TSQL - Picking up first match from a group of rows

I have a simple scenario wherein a table stores data about which card(s) a user uses and whether those cards are registered (exist) in the system. I've also applied ROW_NUMBER to number the rows within each user's group:
SELECT User, CardId, CardExists, ROW_NUMBER() OVER (PARTITION BY User ORDER BY CardId) AS RowNum FROM dbo.CardsInfo
User | CardID | CardExists | RowNum
-------------------------------------
A | 1 | 0 | 1
A | 2 | 1 | 2
A | 3 | 1 | 3
---------------------------------
B | 4 | 0 | 1
B | 5 | 0 | 2
B | 6 | 0 | 3
B | 7 | 0 | 4
---------------------------------
C | 8 | 1 | 1
C | 9 | 0 | 2
C | 10 | 1 | 3
Now, from the above, I need to filter the user cards based on the two rules below:
If, among the cards registered with a user, multiple cards exist in the system, then take the first one. So, for User A, CardID = 2 will be returned, and for User C, CardID = 8.
Otherwise, if no card exists (is registered) in the system for the user, then just take the first one. So, for User B, it should return CardID = 4.
Thus, final returned set should be -
User | CardID | CardExists | RowNum
-------------------------------------
A | 2 | 1 | 2
---------------------------------
B | 4 | 0 | 1
---------------------------------
C | 8 | 1 | 1
How can I do this filtering in SQL?
Thanks

You can use:
SELECT ci.*
FROM (SELECT User, CardId, CardExists,
ROW_NUMBER() OVER (PARTITION BY User ORDER BY CardExists DESC, CardId) AS seqnum
FROM dbo.CardsInfo ci
) ci
WHERE seqnum = 1;
You can also do this with aggregation:
select user,
max(cardexists) as cardexists,
coalesce(min(case when cardexists = 1 then cardid end),
min(cardid)
) as cardid
from cardsinfo
group by user;
Or, if you have a separate users table:
select ci.*
from users u cross apply
(select top (1) ci.*
from cardsinfo ci
where ci.user = u.user
order by ci.cardexists desc, cardid asc
) ci
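As a quick check of the first two approaches, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs both queries. SQLite stands in for SQL Server here, and the column is named UserName (an assumption, chosen because User is a reserved word in T-SQL); neither detail comes from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# UserName is a hypothetical column name standing in for User.
conn.execute("CREATE TABLE CardsInfo (UserName TEXT, CardId INTEGER, CardExists INTEGER)")
conn.executemany(
    "INSERT INTO CardsInfo VALUES (?, ?, ?)",
    [("A", 1, 0), ("A", 2, 1), ("A", 3, 1),
     ("B", 4, 0), ("B", 5, 0), ("B", 6, 0), ("B", 7, 0),
     ("C", 8, 1), ("C", 9, 0), ("C", 10, 1)],
)

# Window-function version: existing cards sort first, then the lowest CardId.
window_sql = """
SELECT UserName, CardId, CardExists
FROM (SELECT ci.*,
             ROW_NUMBER() OVER (PARTITION BY UserName
                                ORDER BY CardExists DESC, CardId) AS seqnum
      FROM CardsInfo ci) ranked
WHERE seqnum = 1
ORDER BY UserName
"""

# Aggregation version: lowest existing CardId, falling back to the lowest CardId.
agg_sql = """
SELECT UserName,
       COALESCE(MIN(CASE WHEN CardExists = 1 THEN CardId END), MIN(CardId)) AS CardId,
       MAX(CardExists) AS CardExists
FROM CardsInfo
GROUP BY UserName
ORDER BY UserName
"""

window_rows = conn.execute(window_sql).fetchall()
agg_rows = conn.execute(agg_sql).fetchall()
print(window_rows)  # [('A', 2, 1), ('B', 4, 0), ('C', 8, 1)]
```

On this data both queries return the same three rows, which match the result set requested in the question.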

Related

Postgres - Unique values for id column using CTE, Joins alongside GROUP BY

I have a table referrals:
id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
3 | 2 | c | t | agent | 3
5 | 3 | e | f | customer | 5
4 | 1 | d | t | agent | 4
2 | 1 | b | f | agent | 2
1 | 1 | a | t | agent | 1
And another table activations
id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
2 | 2 | 3 | 3.0 | 3 | a
4 | 1 | 1 | 6.0 | 5 | b
5 | 4 | 4 | 3.0 | 6 | c
1 | 1 | 2 | 2.0 | 2 | b
3 | 1 | 2 | 5.0 | 4 | b
6 | 1 | 2 | 7.0 | 8 | a
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.
Here is the query I ran:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id )
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
Here is the result I got:
id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
2 | 3 | 14.0 | 2 | 8 | 2 | b | 2 | 1
1 | 1 | 6.0 | 1 | 5 | 1 | b | 1 | 2
2 | 3 | 14.0 | 2 | 8 | 2 | a | 1 | 2
4 | 1 | 3.0 | 4 | 6 | 4 | c | 1 | 2
The problem with this result is that the table has a duplicate id of 2. I only need unique values for the id column.
I tried a workaround using distinct on, which gave the desired result, but I fear the query results may not be reliable and consistent.
Here is the workaround query:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select
distinct on (id) id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id
order by id, best_selling_app_count desc)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
I need a recommendation on how best to achieve this.
"I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count."
Your question is quite complicated, with a very complicated SQL query. However, the sentence quoted above looks like the actual question. If so, you can use:
select r.*,
a.app_id as most_common_app_id,
a.cnt as most_common_app_id_count
from referrals r left join
(select distinct on (a.referral_id) a.referral_id, a.app_id, count(*) as cnt
from activations a
group by a.referral_id, a.app_id
order by a.referral_id, count(*) desc
) a
on a.referral_id = r.id;
You have not explained the other columns that are in your result set.
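For engines without DISTINCT ON, the same "one row per referral_id, highest count first" idea can be written with ROW_NUMBER(). The sketch below is a swapped-in equivalent, not the query above: it runs on an in-memory SQLite database, uses only the columns needed from the sample tables, and adds app_id as a tie-breaker (an assumption of this sketch) so that the pick is deterministic when two apps have the same count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE referrals (id INTEGER, user_id_owner INTEGER, firstname TEXT);
INSERT INTO referrals VALUES (3,2,'c'),(5,3,'e'),(4,1,'d'),(2,1,'b'),(1,1,'a');
CREATE TABLE activations (id INTEGER, referral_id INTEGER, app_id TEXT);
INSERT INTO activations VALUES
    (2,3,'a'),(4,1,'b'),(5,4,'c'),(1,2,'b'),(3,2,'b'),(6,2,'a');
""")

sql = """
SELECT r.id, a.app_id AS most_common_app_id, a.cnt AS most_common_app_id_count
FROM referrals r
LEFT JOIN (
    -- Rank each referral's apps by activation count; seqnum = 1 is the top app.
    SELECT referral_id, app_id, cnt,
           ROW_NUMBER() OVER (PARTITION BY referral_id
                              ORDER BY cnt DESC, app_id) AS seqnum
    FROM (SELECT referral_id, app_id, COUNT(*) AS cnt
          FROM activations
          GROUP BY referral_id, app_id) counts
) a ON a.referral_id = r.id AND a.seqnum = 1
ORDER BY r.id
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Each referral appears exactly once; referral 5 has no activations, so the left join leaves its app columns NULL.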

Get some values from the table by selecting

I have a table:
| id | Number |Address
| -----| ------------|-----------
| 1 | 0 | NULL
| 1 | 1 | NULL
| 1 | 2 | 50
| 1 | 3 | NULL
| 2 | 0 | 10
| 3 | 1 | 30
| 3 | 2 | 20
| 3 | 3 | 20
| 4 | 0 | 75
| 4 | 1 | 22
| 4 | 2 | 30
| 5 | 0 | NULL
I need to get: the NUMBER of the last ADDRESS change for each ID.
I wrote this select:
select dh.id, dh.number from table dh where dh.number =
(select max(min(t.number)) from table t where t.id = dh.id group by t.address)
But this select does not correctly handle the case where the address first changes and then changes back to a previous value. For example, for id=1 the group by returns:
| Number |
| -------- |
| NULL |
| 50 |
I have been thinking about this select for several days, and I will be happy to receive any help.
You can do this using row_number() -- twice:
select t.id, min(number)
from (select t.*,
row_number() over (partition by id order by number desc) as seqnum1,
row_number() over (partition by id, address order by number desc) as seqnum2
from t
) t
where seqnum1 = seqnum2
group by id;
What this does is enumerate the rows by number in descending order:
Once per id.
Once per id and address.
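The two values are the same only for the trailing run of rows that share the most recent address: the latest row gets 1 in both enumerations, the one before it gets 2 in both if its address is unchanged, and so on. Once an earlier row has a different address, the per-id counter is ahead of the per-address counter and they never match again. Aggregation then pulls back the earliest row in that run, which is where the last change happened.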
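To see the two enumerations at work, here is a minimal sketch that runs the query against the sample table from the question in an in-memory SQLite database. SQLite is a stand-in for the asker's engine, and the table name t and alias last_change are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER, address INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 0, None), (1, 1, None), (1, 2, 50), (1, 3, None),
    (2, 0, 10),
    (3, 1, 30), (3, 2, 20), (3, 3, 20),
    (4, 0, 75), (4, 1, 22), (4, 2, 30),
    (5, 0, None),
])

sql = """
SELECT id, MIN(number) AS last_change
FROM (SELECT t.*,
             -- counts back from the latest row of each id
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY number DESC) AS seqnum1,
             -- counts back from the latest row of each (id, address) pair
             ROW_NUMBER() OVER (PARTITION BY id, address ORDER BY number DESC) AS seqnum2
      FROM t) ranked
WHERE seqnum1 = seqnum2
GROUP BY id
ORDER BY id
"""

rows = conn.execute(sql).fetchall()
print(rows)  # [(1, 3), (2, 0), (3, 2), (4, 2), (5, 0)]
```

For id=1 the result is 3 (the change from 50 back to NULL), which is the case the original max(min(...)) query got wrong. For id=3 the rows with number 2 and 3 both satisfy seqnum1 = seqnum2, and MIN picks 2, the point where the address became 20.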
I answered my own question; if anyone needs it, here is my solution:
select * from table dh1 where dh1.number = (
select max(x.number)
from (
select
dh2.id, dh2.number, dh2.address, lag(dh2.address) over(order by dh2.number asc) as prev
from table dh2 where dh1.id=dh2.id
) x
where NVL(x.address, 0) <> NVL(x.prev, 0)
);

Group By with MAX value from another column

Table FieldStudies is :
ID Name
---|-----------------------|
1 | Industrial Engineering|
2 | Civil Engineering |
3 | Architecture |
4 | Chemistry |
And table Eductionals is :
ID UserID Degree FieldStudy_ID
---|------|--------|------------|
1 | 100 | 3 | 4 |
2 | 101 | 2 | 2 |
3 | 101 | 3 | 2 |
4 | 101 | 4 | 3 |
5 | 103 | 3 | 4 |
6 | 103 | 4 | 2 |
I want to find the number of students in each FieldStudies row, counting each student only under their highest Degree.
Output desired:
ID Name Count
---|-----------------------|--------|
1 | Industrial Engineering| 0 |
2 | Civil Engineering | 0 |
3 | Architecture | 1 |
4 | Chemistry | 2 |
I have tried:
select Temptable2.* , count(*) As CountField from
(select fs.*
from FieldStudies fs
left outer join
(select e.UserID , Max(e.Degree) As ID_Degree , e.FieldStudy_ID
from Eductionals e
group by e.UserID) Temptable
ON fs.ID = Temptable.FieldStudy_ID) Temptable2
group by Temptable2.ID
But I get the following error :
Column 'Eductionals.FieldStudy_ID' is invalid in the select list
because it is not contained in either an aggregate function or the
GROUP BY clause.
If I understand correctly, you want only the highest degree for each person. If so, you can use row_number() to whittle down the multiple rows for a given person and the rest is aggregation and join:
select fs.id, fs.Name, count(e.id)
from fieldstudies fs left join
(select e.*,
row_number() over (partition by userid order by degree desc) as seqnum
from Eductionals e
) e
on e.FieldStudy_ID = fs.id and seqnum = 1
group by fs.id, fs.Name
order by fs.id;

SQL count referrals for each user

My query:
SELECT COUNT(referrer) as refs, SUM(amount) as total, contracts.id, userid, fine
FROM contracts
JOIN users ON contracts.userid = users.id
WHERE active = 1
GROUP BY userid
my users table :
id | username | referrer (int)
1 | test | 2
2 | drekorig |
3 | maximili | 2
my contracts table:
id | userid | amount | fine | active
1 | 1 | 50 | 23/10/2018 | 1
2 | 2 | 120 | 24/10/2018 | 1
3 | 2 | 150 | 24/10/2018 | 1
How do I get the count of referrals for each User? My query actually gets the number of contracts instead...
Expected result:
refs | total | id | userid | fine
0 | 0 | 1 | 1 | 23/10/2018
2 | 270 | 2 | 2 | 24/10/2018
http://sqlfiddle.com/#!9/0a464d/5
SELECT COALESCE(r.count, 0) as refs,
SUM(amount) as total,
MAX(c.id),
u.id,
MAX(fine)
FROM users u
LEFT JOIN
(SELECT referrer, COUNT(*) `count`
FROM users
GROUP BY referrer
) r
ON u.id = r.referrer
JOIN contracts c
ON c.userid = u.id
WHERE active = 1
GROUP BY u.id

Efficient ROW_NUMBER increment when column matches value

I'm trying to find an efficient way to derive the column Expected below from only Id and State. What I want is for the number Expected to increase each time State is 0 (ordered by Id).
+----+-------+----------+
| Id | State | Expected |
+----+-------+----------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
| 4 | 1 | 2 |
| 5 | 4 | 2 |
| 6 | 2 | 2 |
| 7 | 3 | 2 |
| 8 | 0 | 3 |
| 9 | 5 | 3 |
| 10 | 3 | 3 |
| 11 | 1 | 3 |
+----+-------+----------+
I have managed to accomplish this with the following SQL, but the execution time is very poor when the data set is large:
WITH Groups AS
(
SELECT Id, ROW_NUMBER() OVER (ORDER BY Id) AS GroupId FROM tblState WHERE State=0
)
SELECT S.Id, S.[State], S.Expected, G.GroupId FROM tblState S
OUTER APPLY (SELECT TOP 1 GroupId FROM Groups WHERE Groups.Id <= S.Id ORDER BY Id DESC) G
Is there a simpler and more efficient way to produce this result? (In SQL Server 2012 or later)
Just use a cumulative sum:
select s.*,
sum(case when state = 0 then 1 else 0 end) over (order by id) as expected
from tblState s;
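The cumulative sum can be checked against the Expected column from the question. This is a minimal sketch using an in-memory SQLite database in place of SQL Server 2012; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblState (Id INTEGER PRIMARY KEY, State INTEGER)")
states = [0, 1, 0, 1, 4, 2, 3, 0, 5, 3, 1]  # State values for Id 1..11
conn.executemany("INSERT INTO tblState VALUES (?, ?)",
                 list(enumerate(states, start=1)))

sql = """
SELECT Id, State,
       -- running count of State = 0 rows up to and including the current Id
       SUM(CASE WHEN State = 0 THEN 1 ELSE 0 END) OVER (ORDER BY Id) AS Expected
FROM tblState
ORDER BY Id
"""

expected = [row[2] for row in conn.execute(sql)]
print(expected)  # [1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3]
```

Because the window includes the current row, a row with State = 0 already counts toward its own group number, which is why Id 1 gets 1 rather than 0.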
Another method uses a correlated subquery:
select t.*,
(select count(*)
from tblState t1
where t1.id <= t.id and t1.state = 0
) as expected
from tblState t;