Calculate variable of max amount in a group - sql

I am having difficulty with the following exercise. I need to find how frequently an id is not the max_id in the group with the largest amount. Only groups that contain at least two different people should be considered.
The data comes from two different tables: max_id, user, and amount come from table1 (which I will call a); id and group come from table2 (b).
From the text above, the conditions should be:
(1) a.id<>b.max_id /* is not */
(2) people in group >=2
(3) a.id<> id of max amount
The dataset looks like
(a)
max_id user amount
(b)
group email
From a previous exercise, I had to compute distinct people as follows:
select distinct a.user,
a.max_id,
b.id
from table1 as a
inner join table2 as b
on b.id=a.max_id
where
b.max_id is not null
and b.time is null
No information from amount was required in the exercise above. This is the main difference between the two exercises, but the structure and fields are quite similar.
Now I need to edit the code above to find how frequently an id is not the max_id in the group with the largest amount. This only makes sense for groups that have at least two different persons/users.
I think I will need to join tables to get the id of max amount in a group and count people in a group, but I do not know how to do it.
Any help would be greatly appreciated. Thank you.
Data sample
max_id user amount id group email
12 1 -2000 12 house email1
312 1 0 54 work email1
11 32 -213 11 house email32
41 13 -43 78 work email13
312 53 -650 34 work email53
1 67 -532 43 defense email67
64 76 -9650 98 work email76
As I understand the exercise, and based on the code above, I should find the rows where id<>max_id in groups (i.e. house, work, defense) that have at least 2 users.
Then, what I need to select is id <> id of max amount.
I hope this makes it a bit clearer.

Assuming you have a query such as
select t.User, m.Email, m.Model, m.Amount
from my_table m
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = m.user
and t.max_amount = m.amount
you can obtain the max id for each amount using
select max(id), Amount
from (
select m.id, t.User, m.Email, m.Model, m.Amount
from my_table m
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = m.user
and t.max_amount = m.amount
) k
group by Amount
and you should obtain the values of id that are not equal to the max id as
select mm.id, t.User, mm.Email, mm.Model, mm.Amount
from my_table mm
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = mm.user
and t.max_amount = mm.amount
inner join (
select max(k.id) max_id, k.Amount
from (
select m.id, t.User, m.Email, m.Model, m.Amount
from my_table m
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = m.user
and t.max_amount = m.amount
) k
group by k.Amount
) kk ON kk.Amount = mm.Amount
and kk.max_id <> mm.id
and based on your last sample the query should be
select m.*
from my_table m
inner join (
select my_group, count(distinct user) user_count
from my_table
group by my_group
having count(distinct user) >= 2
) t on t.my_group = m.my_group
and m.max_id <> m.id
PS: group is a reserved word, so I use my_group for the column name.
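As a rough check of the last query, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. The table name my_table, the column names user_id and my_group, and the use of SQLite itself are assumptions made for illustration; the question does not name the database engine, and it splits the data across two tables.

```python
import sqlite3

# Sample rows from the question: max_id, user, amount, id, group, email.
rows = [
    (12, 1, -2000, 12, "house", "email1"),
    (312, 1, 0, 54, "work", "email1"),
    (11, 32, -213, 11, "house", "email32"),
    (41, 13, -43, 78, "work", "email13"),
    (312, 53, -650, 34, "work", "email53"),
    (1, 67, -532, 43, "defense", "email67"),
    (64, 76, -9650, 98, "work", "email76"),
]

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout; "group" is reserved, hence my_group.
conn.execute(
    "CREATE TABLE my_table (max_id INTEGER, user_id INTEGER, amount INTEGER,"
    " id INTEGER, my_group TEXT, email TEXT)"
)
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?, ?, ?)", rows)

# Keep only groups with at least two distinct users, then keep the rows
# whose id differs from max_id.
query = """
    SELECT m.id
    FROM my_table m
    INNER JOIN (
        SELECT my_group
        FROM my_table
        GROUP BY my_group
        HAVING COUNT(DISTINCT user_id) >= 2
    ) t ON t.my_group = m.my_group
    WHERE m.max_id <> m.id
    ORDER BY m.id
"""
mismatched_ids = [r[0] for r in conn.execute(query)]
print(mismatched_ids)
```

With this sample, the defense group drops out because it has a single user, and both house rows have id equal to max_id, so only the four work rows remain.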

Related

Issue with getting the rank of a user based on combined columns in a join table

I have a users table and each user has flights in a flights table. Each flight has a departure and an arrival airport relationship within an airports table. What I need to do is count up the unique airports across both departure and arrival columns (flights.departure_airport_id and flights.arrival_airport_id) for each user, and then assign them a rank via dense_rank and then retrieve the rank for a given user id.
Basically, I need to order all users according to how many unique airports they have flown to or from and then get the rank for a certain user.
Here's what I have so far:
SELECT u.rank FROM (
SELECT
users.id,
dense_rank () OVER (ORDER BY count(DISTINCT (flights.departure_airport_id, flights.arrival_airport_id)) DESC) AS rank
FROM users
LEFT JOIN flights ON users.id = flights.user_id
GROUP BY users.id
) AS u WHERE u.id = 'uuid';
This works, but does not actually return the desired result as count(DISTINCT (flights.departure_airport_id, flights.arrival_airport_id)) counts the combined airport ids and not each unique airport id separately. That's how I understand it works, anyway... I'm guessing that I somehow need to use a UNION join on the airport id columns but can't figure out how to do that.
I'm on Postgres 13.0.
I would recommend a lateral join to unpivot, then aggregation and ranking:
select *
from (
select f.user_id,
dense_rank() over(order by count(distinct a.airport_id) desc) rn
from flights f
cross join lateral (values
(f.departure_airport_id), (f.arrival_airport_id)
) a(airport_id)
group by f.user_id
) t
where user_id = 'uuid'
You don't really need the users table for what you want, unless you do want to include users without any flight (they would all share the same, last rank, since their airport count is 0). If so:
select *
from (
select u.id,
dense_rank() over(order by count(distinct a.airport_id) desc) rn
from users u
left join flights f on f.user_id = u.id
left join lateral (values
(f.departure_airport_id), (f.arrival_airport_id)
) a(airport_id) on true
group by u.id
) t
where id = 'uuid'
You're counting the distinct pairs of (departure_airport_id, arrival_airport_id). As you suggested, you could use union to get a single column of airport IDs (regardless of whether they are departure or arrival airports), and then apply a count on them:
SELECT user_id, DENSE_RANK() OVER (ORDER BY cnt DESC) AS user_rank
FROM (SELECT u.id AS user_id, COALESCE(cnt, 0) AS cnt
FROM users u
LEFT JOIN (SELECT user_id, COUNT(DISTINCT airport_id) AS cnt
FROM (SELECT user_id, departure_airport_id AS airport_id
FROM flights
UNION
SELECT user_id, arrival_airport_id AS airport_id
FROM flights) x
GROUP BY user_id) f ON u.id = f.user_id) t
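To see the union approach produce ranks, here is a small sketch run against SQLite through Python's sqlite3 module rather than Postgres 13. The airport codes and user ids are made-up sample data, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data: u3 has no flights at all.
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY);
    CREATE TABLE flights (
        user_id TEXT, departure_airport_id TEXT, arrival_airport_id TEXT
    );
    INSERT INTO users VALUES ('u1'), ('u2'), ('u3');
    INSERT INTO flights VALUES
        ('u1', 'JFK', 'LHR'),
        ('u1', 'LHR', 'JFK'),
        ('u2', 'JFK', 'CDG'),
        ('u2', 'CDG', 'AMS');
""")

# UNION stacks departure and arrival ids into one column, so each airport
# is counted once per user no matter which side of a flight it was on.
query = """
    SELECT user_id, cnt, DENSE_RANK() OVER (ORDER BY cnt DESC) AS user_rank
    FROM (
        SELECT u.id AS user_id, COALESCE(f.cnt, 0) AS cnt
        FROM users u
        LEFT JOIN (
            SELECT user_id, COUNT(DISTINCT airport_id) AS cnt
            FROM (
                SELECT user_id, departure_airport_id AS airport_id FROM flights
                UNION
                SELECT user_id, arrival_airport_id AS airport_id FROM flights
            ) x
            GROUP BY user_id
        ) f ON f.user_id = u.id
    ) t
    ORDER BY user_rank, user_id
"""
ranks = conn.execute(query).fetchall()
for row in ranks:
    print(row)
```

To get the rank of a single user, this result would be wrapped in one more subquery and filtered on user_id, as in the question's original attempt.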

How can I randomly distribute rows in one table to rows in another table in oracle SQL

I am trying to figure out a SQL query that will distribute records from one table to another table randomly.
For example:
I have a table of customers, and I want to assign each one a car from a table of cars.
I want to make sure that the cars are randomly distributed, so that no property of a customer predicts which car they receive.
Customers:
(Jon,Sam,Sara,Jack,Adam,Adrian)
Cars:
(BMW,Dodge,Lexus)
Result:
(Jon-BMW,Sam-Lexus,Sara-BMW,Jack-Dodge,Adam-Dodge,Adrian-BMW)
How can I do that in Oracle SQL?
Here's one option:
with t as
(select u.name ||'-'||a.name comb,
row_number() over (partition by u.name order by dbms_random.value(1, n.cnt)) rn
from customers u cross join cars a
join (select count(*) cnt from cars) n on 1 = 1
)
select t.comb
from t
where rn = 1;
COMB
-----------------------------------------
Adam-Lexus
Adrian-BMW
Jack-Lexus
Jon-BMW
Sam-Dodge
Sara-Lexus
6 rows selected.
One method that might be more efficient than a full cross join is to give each customer a random integer between 1 and the number of cars, and join that to a random numbering of the cars:
select c.*, cc.car
from (select c.*,
trunc(dbms_random.value(1, cc.cnt + 1)) as seqnum
from customers c cross join
(select count(*) as cnt from cars) cc
) c join
(select cc.*, row_number() over (order by dbms_random.random) as seqnum
from cars cc
) cc
on cc.seqnum = c.seqnum;
If there is no requirement that every car be used, and database resources are not a concern:
select customer_name||'-'||car_name result
from (
select u.name customer_name, c.name car_name,
row_number() over ( partition by u.name order by dbms_random.value ) ord
from customers u
cross join cars c
)
where ord = 1
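The cross join plus row_number idea can be tried outside Oracle as well. The sketch below runs it on SQLite through Python's sqlite3 module, with SQLite's random() standing in for dbms_random.value; that substitution and the name column on both tables are assumptions for illustration. The output changes on every run, so none is shown.

```python
import sqlite3

customers = ["Jon", "Sam", "Sara", "Jack", "Adam", "Adrian"]
cars = ["BMW", "Dodge", "Lexus"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.execute("CREATE TABLE cars (name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?)", [(n,) for n in customers])
conn.executemany("INSERT INTO cars VALUES (?)", [(n,) for n in cars])

# Pair every customer with every car, shuffle the cars within each customer,
# and keep the first car of each shuffled list.
assignment = conn.execute("""
    SELECT customer_name, car_name
    FROM (
        SELECT u.name AS customer_name, c.name AS car_name,
               ROW_NUMBER() OVER (PARTITION BY u.name ORDER BY random()) AS ord
        FROM customers u
        CROSS JOIN cars c
    ) picked
    WHERE ord = 1
    ORDER BY customer_name
""").fetchall()

for customer_name, car_name in assignment:
    print(f"{customer_name}-{car_name}")
```

Each customer is drawn independently, so a car can be handed out several times or not at all, which matches the expected result in the question.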

find similarity of merchant with customers

I have a table in SQL Server 2012 that has these columns:
user_id , merchant_id
I want to find the top 5 most similar partners for each merchant.
The similarity is simply defined as the normalized number of overlapping customers.
I cannot find any solution for this problem.
The following query counts the number of common customers for two merchants:
select t.merchantid as m1, t2.merchantid as m2, count(*) as common_customers
from table t join
table t2
on t.customerid = t2.customerid and t.merchantid <> t2.merchantid
group by t.merchantid, t2.merchantid;
The following gets the top five based on the raw counts:
select *
from (select t.merchantid as m1, t2.merchantid as m2, count(*) as common_customers,
row_number() over (partition by t.merchantid order by count(*) desc) as seqnum
from table t join
table t2
on t.customerid = t2.customerid and t.merchantid <> t2.merchantid
group by t.merchantid, t2.merchantid
) mm
where seqnum <= 5;
I do not know what you mean by "normalized". The term "normalized" in statistics would often not change the ordering of values (but would result in the sum of the squares being 1), so this may do what you want.
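If "normalized" is meant as overlap divided by the size of the combined customer base (Jaccard similarity), the raw counts can be rescaled before ranking. That reading is only an assumption, since the question never defines the term. The sketch below uses SQLite through Python's sqlite3 module instead of SQL Server 2012, a hypothetical table name visits, made-up sample rows, and it assumes each (user_id, merchant_id) pair appears only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, merchant_id TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "A"), (2, "B"), (3, "B"), (4, "B"), (3, "C")],
)

# similarity = common customers / customers of either merchant (Jaccard).
query = """
    WITH sizes AS (
        SELECT merchant_id, COUNT(DISTINCT user_id) AS n
        FROM visits
        GROUP BY merchant_id
    ),
    pairs AS (
        SELECT t.merchant_id AS m1, t2.merchant_id AS m2, COUNT(*) AS common
        FROM visits t
        JOIN visits t2
          ON t.user_id = t2.user_id AND t.merchant_id <> t2.merchant_id
        GROUP BY t.merchant_id, t2.merchant_id
    ),
    scored AS (
        SELECT p.m1, p.m2,
               1.0 * p.common / (s1.n + s2.n - p.common) AS similarity
        FROM pairs p
        JOIN sizes s1 ON s1.merchant_id = p.m1
        JOIN sizes s2 ON s2.merchant_id = p.m2
    )
    SELECT m1, m2, similarity
    FROM (
        SELECT scored.*,
               ROW_NUMBER() OVER (
                   PARTITION BY m1 ORDER BY similarity DESC, m2
               ) AS seqnum
        FROM scored
    ) ranked
    WHERE seqnum <= 5
    ORDER BY m1, seqnum
"""
top_partners = conn.execute(query).fetchall()
for row in top_partners:
    print(row)
```

Unlike the raw counts, this score can change the ordering: a small merchant that shares most of its customers with a partner ranks above a large merchant that shares the same number.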

SQL Sum of rows grouped by id

This SQL:
select Name,
(select COUNT(1) from tbl_projects where statusId = tbl_sections.StatusId) as N
from tbl_sections
left join tbl_section_names on tbl_section_names.Id = NameId
Generates the following data:
Name N
Completed 133
Cancelled 100
Unassigned 1
Sales 49
Development 10
Development 4
Development 1
I'm trying to modify it so it returns the data as follows:
Name N
Completed 133
Cancelled 100
Unassigned 1
Sales 49
Development 15
(ie, sum up the rows where the name is the same)
Can anyone suggest how to make this work? I'm guessing I need a SUM and a GROUP BY, but every attempt fails with errors before the query even runs.
Try this query. It sums N grouped by Name.
SELECT Name, SUM(N)
FROM (
SELECT Name,
(SELECT COUNT(1)
FROM tbl_projects
WHERE statusId = tbl_sections.StatusId
) AS N
FROM tbl_sections
LEFT JOIN tbl_section_names ON tbl_section_names.Id = NameId
) a
GROUP BY a.Name
Try this
select Name, count(p.statusid) N
from tbl_sections
left join tbl_section_names on tbl_section_names.Id = NameId
left outer join tbl_projects p on tbl_sections.StatusId = p.statusId
group by Name
Select t.Name, Sum(t.N) from
(select Name,
(select COUNT(1) from tbl_projects where statusId = tbl_sections.StatusId) as N
from tbl_sections
left join tbl_section_names on tbl_section_names.Id = NameId) t
group by t.Name
This query gives you a count per status, which means Development has sections with three different statuses. The query would reflect this and make more sense if you added the status as a column:
select Name, tbl_sections.StatusId,
(select COUNT(1) from tbl_projects where statusId = tbl_sections.StatusId) as N
from tbl_sections
left join tbl_section_names on tbl_section_names.Id = NameId
I don't know the structure of your database, but if you want a count of the number of sections per name, it might look like this. It takes the result of the join and summarizes it by telling you the number of times each unique name occurs:
select Name, count(*)
from tbl_sections
left join tbl_section_names on tbl_section_names.Id = NameId
Group By Name
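Try and give this a go. The subquery needs its own parentheses inside SUM. Note that SQL Server rejects an aggregate over a subquery, so this form is only an option on engines that allow it; otherwise use a derived table.
select Name,
SUM((select COUNT(1) from tbl_projects where statusId = tbl_sections.StatusId)) as N
from tbl_sections
left join tbl_section_names on tbl_section_names.Id = NameId
group by Name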
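The derived-table approach from the first answer can be checked on a toy schema. The sketch below uses SQLite through Python's sqlite3 module; the column layout is inferred from the queries in the question and the rows are made up, so treat both as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: a section points at a name and a status; projects carry a status.
conn.executescript("""
    CREATE TABLE tbl_section_names (Id INTEGER, Name TEXT);
    CREATE TABLE tbl_sections (NameId INTEGER, StatusId INTEGER);
    CREATE TABLE tbl_projects (statusId INTEGER);
    INSERT INTO tbl_section_names VALUES (1, 'Development'), (2, 'Sales');
    -- Development is split across two statuses, Sales has one.
    INSERT INTO tbl_sections VALUES (1, 10), (1, 11), (2, 20);
    INSERT INTO tbl_projects VALUES (10), (10), (11), (20);
""")

# Inner query: one row per section with its project count.
# Outer query: add those counts up per name.
totals = conn.execute("""
    SELECT a.Name, SUM(a.N) AS N
    FROM (
        SELECT n.Name,
               (SELECT COUNT(1)
                FROM tbl_projects p
                WHERE p.statusId = s.StatusId) AS N
        FROM tbl_sections s
        LEFT JOIN tbl_section_names n ON n.Id = s.NameId
    ) a
    GROUP BY a.Name
    ORDER BY a.Name
""").fetchall()
print(totals)
```

Here the two Development sections contribute 2 and 1 projects, which the outer SUM collapses into a single row.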

Join table on null where condition

I have tables Member and Transaction. Table Member has 2 columns MemberID and MemberName. Table Transaction has 3 columns, MemberID, TransactionDate, and MemberBalance.
The rows in the tables are as shown below:
Table Member:
MemberID MemberName
=============================
1 John
2 Betty
3 Lisa
Table Transaction:
MemberID TransactionDate MemberBalance
=====================================================
1 13-12-2012 200
2 12-12-2012 90
1 10-09-2012 300
I would like to query for MemberID, MemberName and MemberBalance where the TransactionDate is the latest (max) for each MemberID.
My query is like this:
SELECT
t.MemberID, m.MemberName , t.MemberBalance
FROM
Member AS m
INNER JOIN
Transaction AS t ON m.MemberID = t.MemberID
WHERE
t.TransactionDate IN (SELECT MAX(TransactionDate)
FROM Transaction
GROUP BY MemberID)
This query returns:
MemberID MemberName MemberBalance
===================================================
1 John 200
2 Betty 90
My problem is, I want the query to return:
MemberID MemberName MemberBalance
===================================================
1 John 200
2 Betty 90
3 Lisa NULL
I want the member to be displayed even if its MemberID does not exist in the Transaction table.
How do I do this?
Thank you.
You can also use something like this:
SELECT m.MemberID, m.MemberName, t1.MemberBalance
FROM Member AS m
LEFT JOIN
(
select max(transactionDate) transactionDate,
MemberID
from [Transaction]
group by MemberID
) AS t
ON m.MemberID = t.MemberID
left join [Transaction] t1
on t.transactionDate = t1.transactionDate
and t.memberid = t1.memberid
"member to be displayed even if its MemberID does not exist in Transaction table"
You can preserve rows by using a LEFT JOIN from the Member table to the Transaction table.
"where the TransactionDate is the latest (max) for each MemberID"
From SQL Server 2005 onwards, a common and usually well-performing method is to use ROW_NUMBER():
SELECT MemberID, MemberName, MemberBalance
FROM (
SELECT m.MemberID, m.MemberName , t.MemberBalance,
row_number() over (partition by m.MemberID order by t.TransactionDate desc) rn
FROM Member AS m
LEFT JOIN [Transaction] AS t ON m.MemberID = t.MemberID
) X
WHERE rn=1;
To keep every member in the result set, you need an outer join.
Also, remember to correlate the inner SELECT on MemberID. Otherwise a member's older transaction can pass the WHERE condition whenever its date happens to equal the latest date of some other member, and that member then appears twice in the result.
You need to use LEFT JOIN. Your query also had a bug: because the subquery is not tied to the member, a member can get two rows if one of their older transactions falls on the same date as another member's latest transaction.
Try this
SELECT m.MemberID, m.MemberName , t.MemberBalance
FROM Member AS m
LEFT JOIN [Transaction] AS t ON m.MemberID = t.MemberID AND t.TransactionDate=
(
SELECT MAX(TransactionDate)
FROM [Transaction] T2
WHERE T2.MemberID=t.MemberID
)
SELECT a.MemberId,a.MemberName,a.MemberBalance
FROM
(
SELECT m.MemberId,m.MemberName,t1.MemberBalance
,ROW_NUMBER() OVER(PARTITION BY m.MemberId ORDER BY t1.TransactionDate DESC) AS RN
FROM
#Member m OUTER APPLY (SELECT t.MemberId,t.MemberBalance,t.TransactionDate
FROM #Transaction t WHERE m.MemberId=t.MemberId) t1
)a
WHERE a.RN=1
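The LEFT JOIN plus ROW_NUMBER() pattern can be verified against the rows from the question. The sketch below runs on SQLite through Python's sqlite3 module instead of SQL Server; the table is named transactions here (an assumption, since TRANSACTION is a keyword) and the dates are rewritten in ISO format so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (MemberID INTEGER, MemberName TEXT);
    CREATE TABLE transactions (
        MemberID INTEGER, TransactionDate TEXT, MemberBalance INTEGER
    );
    INSERT INTO member VALUES (1, 'John'), (2, 'Betty'), (3, 'Lisa');
    INSERT INTO transactions VALUES
        (1, '2012-12-13', 200),
        (2, '2012-12-12', 90),
        (1, '2012-09-10', 300);
""")

# LEFT JOIN keeps members without transactions; ROW_NUMBER() numbers each
# member's transactions from newest to oldest, and rn = 1 keeps the newest.
latest = conn.execute("""
    SELECT MemberID, MemberName, MemberBalance
    FROM (
        SELECT m.MemberID, m.MemberName, t.MemberBalance,
               ROW_NUMBER() OVER (
                   PARTITION BY m.MemberID ORDER BY t.TransactionDate DESC
               ) AS rn
        FROM member AS m
        LEFT JOIN transactions AS t ON m.MemberID = t.MemberID
    ) x
    WHERE rn = 1
    ORDER BY MemberID
""").fetchall()
print(latest)
```

Lisa has no transactions, so her single joined row carries NULL columns, gets rn = 1, and comes back with a balance of None.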