How to get these rows as columns in an SQL query - sql

I need some help in writing up this SQL query using a single table. Something like this
User ID  Category  Spend  Transactions  Country
1        Sport     30     2             USA
1        Bills     60     3             USA
2        Sport     10     1             MEX
3        Grocery   50     8             CAN
2        Grocery   70     4             MEX
3        Sport     20     5             CAN
3        Bills     30     2             CAN
1        Petrol    60     5             USA
I then want to group the rows by User ID, split the spend and transactions into one column per category, and keep the country as a column by itself, like this:
User ID  Sport_Spend  Bills_Spend  Grocery_Spend  Petrol_Spend  Sport_Transactions  Bills_Transactions  Grocery_Transactions  Petrol_Transactions  Country
1        30           60           0              60            2                   3                   0                     5                    USA
2        10           0            70             0             1                   0                   4                     0                    MEX
3        20           30           50             0             5                   2                   8                     0                    CAN
It's stumping me a bit, and I would appreciate some help.

@jarlh's comments are the most relevant and need to be addressed, but here is something to start with (MS SQL Server code). I left out the Transactions columns to keep the example small; the coding is just the same. https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=25550539029ba1c4be0826725bf9e00a
with data (UserID,Category,Spend,Transactions,Country) as(
select 1,'Sport',30,2,'USA' union all
select 1,'Bills',60,3,'USA' union all
select 2,'Sport',10,1,'MEX' union all
select 3,'Grocery',50,8,'CAN' union all
select 2,'Grocery',70,4,'MEX' union all
select 3,'Sport',20,5,'CAN' union all
select 3,'Bills',30,2,'CAN' union all
select 1,'Petrol',60,5,'USA'
)
select UserID
,isnull(SUM([Sport]),0)as Sport
,isnull(SUM([Bills]),0)as Bills
,isnull(SUM([Grocery]),0)as Grocery
,isnull(SUM([Petrol]),0)as Petrol
,MAX(Country)as Country
from (
select UserID,Category,Spend,Transactions,Country
from data) p
PIVOT(
SUM(SPEND)
For CATEGORY in ([Sport] ,[Bills] ,[Grocery] ,[Petrol])
)as PivotTable
group by UserID

select
COALESCE(user_id,0) as user_id,
COALESCE(Sport_Spend,0) as Sport_Spend,
COALESCE(Bills_Spend,0) as Bills_Spend,
COALESCE(Grocery_Spend,0) as Grocery_Spend,
COALESCE(Petrol_Spend,0) as Petrol_Spend,
COALESCE(Sport_Transactions,0) as Sport_Transactions,
COALESCE(Bills_Transactions,0) as Bills_Transactions,
COALESCE(Grocery_Transactions,0) as Grocery_Transactions,
COALESCE(Petrol_Transactions,0) as Petrol_Transactions
,country from
(SELECT DISTINCT user_id,country from table_name) as A
LEFT JOIN
(select user_id, spend as Sport_Spend ,transactions as Sport_Transactions from table_name where category='Sport') as B using (user_id)
LEFT JOIN
(select user_id, spend as Bills_Spend ,transactions as Bills_Transactions from table_name where category='Bills') as C using (user_id)
LEFT JOIN
(select user_id, spend as Grocery_Spend ,transactions as Grocery_Transactions from table_name where category='Grocery') as D using (user_id)
LEFT JOIN
(select user_id, spend as Petrol_Spend ,transactions as Petrol_Transactions from table_name where category='Petrol') as E using (user_id)
ORDER BY user_id;
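A portable alternative to both answers above is conditional aggregation: one SUM(CASE ...) per output column, grouped by user. The sketch below is not taken from either answer. It runs the query on an in-memory SQLite database through Python's sqlite3 module so it can be tried without a server, and the table name user_spend and the lower-case column names are assumptions made for the illustration.

```python
import sqlite3

# Sample rows copied from the question. The table name "user_spend" is hypothetical.
sample_rows = [
    (1, "Sport", 30, 2, "USA"),
    (1, "Bills", 60, 3, "USA"),
    (2, "Sport", 10, 1, "MEX"),
    (3, "Grocery", 50, 8, "CAN"),
    (2, "Grocery", 70, 4, "MEX"),
    (3, "Sport", 20, 5, "CAN"),
    (3, "Bills", 30, 2, "CAN"),
    (1, "Petrol", 60, 5, "USA"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_spend "
    "(user_id INT, category TEXT, spend INT, transactions INT, country TEXT)"
)
conn.executemany("INSERT INTO user_spend VALUES (?, ?, ?, ?, ?)", sample_rows)

# One SUM(CASE ...) per output column; ELSE 0 supplies the zero
# for a category the user never spent in.
query = """
SELECT user_id,
       SUM(CASE WHEN category = 'Sport'   THEN spend ELSE 0 END) AS Sport_Spend,
       SUM(CASE WHEN category = 'Bills'   THEN spend ELSE 0 END) AS Bills_Spend,
       SUM(CASE WHEN category = 'Grocery' THEN spend ELSE 0 END) AS Grocery_Spend,
       SUM(CASE WHEN category = 'Petrol'  THEN spend ELSE 0 END) AS Petrol_Spend,
       SUM(CASE WHEN category = 'Sport'   THEN transactions ELSE 0 END) AS Sport_Transactions,
       SUM(CASE WHEN category = 'Bills'   THEN transactions ELSE 0 END) AS Bills_Transactions,
       SUM(CASE WHEN category = 'Grocery' THEN transactions ELSE 0 END) AS Grocery_Transactions,
       SUM(CASE WHEN category = 'Petrol'  THEN transactions ELSE 0 END) AS Petrol_Transactions,
       MAX(country) AS Country
FROM user_spend
GROUP BY user_id
ORDER BY user_id
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Like the PIVOT version, this assumes each user has a single country (MAX(country) just picks one) and that the category list is known in advance.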

Related

How to select IDs that have at least two specific instances in a given column

I'm working with a medical claim table in pyspark and I want to return only userid's that have at least 2 claim_ids. My table looks something like this:
claim_id | userid | diagnosis_type | claim_type
__________________________________________________
1 1 C100 M
2 1 C100a M
3 2 D50 F
5 3 G200 M
6 3 C100 M
7 4 C100a M
8 4 D50 F
9 4 A25 F
From this example, I would want to return userid's 1, 3, and 4 only. Currently I'm building a temp table to count all of the distinct instances of the claim_ids
create table temp.claim_count as
select distinct userid, count(distinct claim_id) as claims
from medical_claims
group by userid
and then pulling from this table when the number of claim_id >1
select distinct userid
from medical_claims
where userid in (
select distinct userid
from temp.claim_count
where claims>1)
Is there a better / more efficient way of doing this?
If you want only the ids, then use group by:
select userid, count(*) as claims
from medical_claims
group by userid
having count(*) > 1;
If you want the original rows, then use window functions:
select mc.*
from (select mc.*, count(*) over (partition by userid) as num_claims
from medical_claims mc
) mc
where num_claims > 1;
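Both queries can be checked against the sample table from the question. The sketch below uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for the Spark SQL environment, which is an assumption made for the illustration. The window-function query needs SQLite 3.25 or later.

```python
import sqlite3

# Sample claims copied from the question.
claims = [
    (1, 1, "C100", "M"), (2, 1, "C100a", "M"), (3, 2, "D50", "F"),
    (5, 3, "G200", "M"), (6, 3, "C100", "M"), (7, 4, "C100a", "M"),
    (8, 4, "D50", "F"), (9, 4, "A25", "F"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medical_claims "
    "(claim_id INT, userid INT, diagnosis_type TEXT, claim_type TEXT)"
)
conn.executemany("INSERT INTO medical_claims VALUES (?, ?, ?, ?)", claims)

# Only the ids: GROUP BY with a HAVING filter, no temp table needed.
users = conn.execute("""
    SELECT userid, COUNT(*) AS claims
    FROM medical_claims
    GROUP BY userid
    HAVING COUNT(*) > 1
    ORDER BY userid
""").fetchall()
print(users)

# The original rows: count per user with a window function, then filter.
kept_rows = conn.execute("""
    SELECT claim_id, userid
    FROM (SELECT mc.*, COUNT(*) OVER (PARTITION BY userid) AS num_claims
          FROM medical_claims AS mc) AS counted
    WHERE num_claims > 1
    ORDER BY claim_id
""").fetchall()
print(kept_rows)
```

Userid 2 has a single claim, so it drops out of both results.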

SQL Group By + Count with multiple tables

I'm studying for an interview next week which has a small data analysis component. The recruiter gave me the following sample SQL question, and I'm having trouble wrapping my mind around a solution. I'm hoping that I'm not biting off more than I can chew ;)
SAMPLE QUESTION:
You are given two tables:
AdClick Table (columns: ClickID, AdvertiserID, UserID, and other
fields) and AdConversion Table (columns: ClickID, UserID and other
fields).
You have to find the total conversion rate (# of conversions/# of
clicks) for users with 1 click, 2 clicks, etc.
I've been playing with this for about an hour and keep hitting road blocks. I understand COUNT and GROUP BY but suspect I'm missing a simple SQL feature that I'm unaware of. This also makes it difficult for me to find any possible pointers/solutions via Google: not knowing the magic keywords to search on.
Example Input
dbo.AdConversion
----------------
ClickID UserID
1 1
2 1
4 1
5 3
6 2
7 2
12 1
9 4
10 4
dbo.AdClick
-----------
ClickID AdvertiserID UserID
1 1 1
2 2 1
3 1 2
4 1 1
5 1 3
6 2 2
7 3 2
8 1 1
9 4 4
10 2 4
11 3 4
12 2 1
Expected Result:
----------------
UserClickCount ConversionRate
4 80.00%
2 66.67%
1 100.00%
Explanation/Clarification:
Users with 4 AdConversion.ClickIDs (aka Conversions) have an 80% conversion rate.
Here there's just one user, UserID 1, who has 5 AdClicks with 4 AdConversions.
Users with 2 Conversions have a combined 6 AdClicks with 4 conversions, for a conversion rate of 66.67%. Here, that'd be UserID 2 and 4.
Users with 1 Conversion, here only UserID 3, have 1 conversion against 1 AdClick, for a 100% conversion rate.
Here's one possible solution I've come up with after some direction from Zack's comment. I doubt it's the ideal solution, and I don't know whether it has bugs:
DECLARE @Conversions TABLE
(
UserID int NOT NULL,
AdConversions int
)
INSERT INTO @Conversions (UserID, AdConversions)
SELECT adc.UserID, COUNT(adc.UserID)
FROM dbo.AdConversion adc
GROUP BY adc.UserID;
DECLARE @Clicks TABLE
(
UserID int NOT NULL,
AdClicks int
)
INSERT INTO @Clicks (UserID, AdClicks)
SELECT UserID, COUNT(ClickID)
FROM dbo.AdClick
GROUP BY UserID;
SELECT co.AdConversions, CONVERT(decimal(6,3), (CAST(SUM(co.AdConversions) AS float) / SUM(cl.AdClicks))) * 100
FROM @Conversions co
INNER JOIN @Clicks cl
ON co.UserID = cl.UserID
GROUP BY co.AdConversions;
Any advice would be greatly appreciated!
Thanks,
Michael
Your logic seems good. Here is a version with common table expressions and a little update with the numeric conversion:
WITH tConversions as
(SELECT UserID, COUNT(ClickID) as AdConversions
FROM AdConversion
GROUP BY UserID),
tClicks as
(SELECT UserID, COUNT(ClickID) as AdClicks
FROM AdClick
GROUP BY UserID)
SELECT co.AdConversions, CONVERT(decimal(10,2),CAST(SUM(co.AdConversions) as float) / SUM(cl.AdClicks) * 100) as ConversionRate
FROM tConversions co
INNER JOIN tClicks cl
ON co.UserID = cl.UserID
GROUP BY co.AdConversions
You can also use subqueries directly:
SELECT co.AdConversions, CONVERT(decimal(10,2),CAST(SUM(co.AdConversions) as float) / SUM(cl.AdClicks) * 100) as ConversionRate
FROM
(SELECT UserID, COUNT(ClickID) as AdConversions
FROM AdConversion
GROUP BY UserID)
as co
INNER JOIN
(SELECT UserID, COUNT(ClickID) as AdClicks
FROM AdClick
GROUP BY UserID)
as cl
ON co.UserID = cl.UserID
GROUP BY co.AdConversions
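The common-table-expression version can be checked against the example input from the question. This is a minimal sketch that runs it on an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for the illustration. SQLite has no CONVERT, so ROUND and a 100.0 multiplier stand in for the T-SQL cast to float and decimal.

```python
import sqlite3

# Example input copied from the question: (ClickID, UserID) and
# (ClickID, AdvertiserID, UserID).
conversions = [(1, 1), (2, 1), (4, 1), (5, 3), (6, 2), (7, 2),
               (12, 1), (9, 4), (10, 4)]
clicks = [(1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 1, 1), (5, 1, 3), (6, 2, 2),
          (7, 3, 2), (8, 1, 1), (9, 4, 4), (10, 2, 4), (11, 3, 4), (12, 2, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AdConversion (ClickID INT, UserID INT)")
conn.execute("CREATE TABLE AdClick (ClickID INT, AdvertiserID INT, UserID INT)")
conn.executemany("INSERT INTO AdConversion VALUES (?, ?)", conversions)
conn.executemany("INSERT INTO AdClick VALUES (?, ?, ?)", clicks)

# Aggregate each table per user first, then join the two per-user totals
# and regroup by the number of conversions.
query = """
WITH tConversions AS (
    SELECT UserID, COUNT(ClickID) AS AdConversions
    FROM AdConversion
    GROUP BY UserID
),
tClicks AS (
    SELECT UserID, COUNT(ClickID) AS AdClicks
    FROM AdClick
    GROUP BY UserID
)
SELECT co.AdConversions,
       ROUND(100.0 * SUM(co.AdConversions) / SUM(cl.AdClicks), 2) AS ConversionRate
FROM tConversions AS co
INNER JOIN tClicks AS cl ON co.UserID = cl.UserID
GROUP BY co.AdConversions
ORDER BY co.AdConversions DESC
"""
rates = conn.execute(query).fetchall()
for conversion_count, rate in rates:
    print(conversion_count, rate)
```

Because of the INNER JOIN, a user with clicks but no conversions does not appear in any group. The sample data has no such user.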

Identify same amounts over different users

Consider the following table Orders:
OrderID Name Amount
-----------------------
1 A 100
2 A 5
3 B 32
4 C 4000
5 D 701
6 E 32
7 F 200
8 G 100
9 H 12
10 I 17
11 J 100
12 J 100
13 J 11
14 A 5
I need to identify, for each unique 'Amount', if there are 2 or more users that have ordered that exact amount, and then list the details of those orders. So the desired output would be:
OrderID Name Amount
---------------------
1 A 100
8 G 100
11 J 100
12 J 100
3 B 32
6 E 32
Please note that user A has placed two orders of 5 (orders 2 and 14), but these shouldn't be in the output because they belong to the same user. They should appear only if another user had also made an order of 5.
Can anyone help me out?
I would just use exists:
select o.*
from orders o
where exists (select 1
from orders o2
where o2.amount = o.amount and o2.name <> o.name
);
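The EXISTS query can be checked against the Orders table from the question. This is a minimal sketch on an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for the illustration. The ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Orders copied from the question: (OrderID, Name, Amount).
orders = [
    (1, "A", 100), (2, "A", 5), (3, "B", 32), (4, "C", 4000), (5, "D", 701),
    (6, "E", 32), (7, "F", 200), (8, "G", 100), (9, "H", 12), (10, "I", 17),
    (11, "J", 100), (12, "J", 100), (13, "J", 11), (14, "A", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INT, name TEXT, amount INT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Keep an order only if some other user has an order of the same amount.
query = """
SELECT o.orderid, o.name, o.amount
FROM orders AS o
WHERE EXISTS (SELECT 1
              FROM orders AS o2
              WHERE o2.amount = o.amount AND o2.name <> o.name)
ORDER BY o.orderid
"""
shared = conn.execute(query).fetchall()
for row in shared:
    print(row)
```

Orders 2 and 14 (both by user A, amount 5) are excluded because no other user ordered that amount.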
You can do :
select t.*
from table t
where exists (select 1 from table t1 where t1.amount = t.amount and t1.name <> t.name);
If you want only the amounts, group by Amount and count the distinct names:
SELECT Amount,
count(DISTINCT name) AS c
FROM TABLE
GROUP BY Amount
HAVING count(DISTINCT name) > 1
ORDER BY c DESC
If you want the full rows:
select * from table where Amount in (
select Amount from table
group by Amount having count(distinct name) > 1)

Need to find the count of user who belongs to different depts

I have a table with dept, user and so on. I need to count the users that belong to each combination of departments.
Let's say I have a table like this:
dept user
1 33
1 33
1 45
2 11
2 12
3 33
3 15
Then I have to find the unique user and dept combinations, something like this:
select distinct dept,user from x;
Which gives me a result like:
Dept user
1 33
1 45
2 11
2 12
3 33
3 15
which removes the duplicate combinations.
And here's what I need to do.
My output should look like this:
dep_1_1 dep_1_2 dep_1_3 dep_2_2 dep_2_1 dep_2_3 Dep_3_1 Dep_3_2 Dep_3_3
2 0 1 2 0 0 1 0 2
So, basically, I need to find the count of common users between all the combinations of departments.
Thanks for the help.
You can get a row for each department combination using a self-join of your Distinct Select:
with cte as
(
select distinct dept,user from x
)
select t1.dept, t2.dept, count(*)
from cte as t1 join cte as t2
on t1.user = t2.user -- same user
and t1.dept < t2.dept -- different department
group by t1.dept, t2.dept
order by t1.dept, t2.dept
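The self-join can be checked against the sample rows from the question. This is a minimal sketch on an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for the illustration. The column is named user_id here instead of user, also an assumption, to avoid a word that is reserved in some databases.

```python
import sqlite3

# Sample rows copied from the question, including the duplicate (1, 33).
rows = [(1, 33), (1, 33), (1, 45), (2, 11), (2, 12), (3, 33), (3, 15)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (dept INT, user_id INT)")
conn.executemany("INSERT INTO x VALUES (?, ?)", rows)

# De-duplicate first, then pair every department with every later
# department that shares a user, and count the shared users.
query = """
WITH cte AS (
    SELECT DISTINCT dept, user_id FROM x
)
SELECT t1.dept, t2.dept, COUNT(*)
FROM cte AS t1
JOIN cte AS t2
  ON t1.user_id = t2.user_id   -- same user
 AND t1.dept < t2.dept         -- different department
GROUP BY t1.dept, t2.dept
ORDER BY t1.dept, t2.dept
"""
pairs = conn.execute(query).fetchall()
print(pairs)
```

Only user 33 appears in two departments (1 and 3), so a single row comes back. Pairs with no common user are simply absent rather than reported as 0, and changing `<` to `<=` would also pair each department with itself.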

How to declare a row as a Alternate Row

id Name claim priority
1 yatin 70 5
6 yatin 1 10
2 hiren 30 3
3 pankaj 40 2
4 kavin 50 1
5 jigo 10 4
7 jigo 1 10
This is my table, and I want to arrange it as shown below:
id Name claim priority AlternateFlag
1 yatin 70 5 0
6 yatin 1 10 0
2 hiren 30 3 1
3 pankaj 40 2 0
4 kavin 50 1 1
5 jigo 10 4 0
7 jigo 1 10 0
The flag alternates between groups of rows that share the same name.
I am using SQL Server 2005. The alternate flag starts with '0'. In my example the first record has the name "yatin", so its AlternateFlag is '0'.
The second record has the same name, "yatin", so its alternate flag is also '0'.
The third record, with the name "hiren", is a single record, so it gets '1'.
In short, I want to identify alternating groups of rows with the same name.
I hope you understand my problem.
Thanks in advance.
Try
SELECT t.*, f.AlternateFlag
FROM tbl t
JOIN (
SELECT [name],
AlternateFlag = ~CAST(ROW_NUMBER() OVER(ORDER BY MIN(ID)) % 2 AS BIT)
FROM tbl
GROUP BY name
) f ON f.name = t.name
demo
You could probably use the aggregate function COUNT() with HAVING and then UNION both results, like:
SELECT id, A.Name, Claim, Priority, 0 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) > 1 ) A
ON YourTable.Name = A.Name
UNION ALL
SELECT id, B.Name, Claim, Priority, 1 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) = 1 ) B
ON YourTable.Name = B.Name
Now, this assumes that the names are unique, meaning that a name like Yatin, although it appears twice, is associated with only one person.
See my SqlFiddle Demo
You can use the ROW_NUMBER() function with OVER, which gives you an enumeration, then take the remainder of integer division by 2, so you'll get 1s and 0s in your SELECT or in the view.
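The ROW_NUMBER() idea can be sketched as follows: number the name groups in order of their first id, then take the remainder modulo 2. The example runs on an in-memory SQLite database (3.25 or later for window functions) through Python's sqlite3 module, which is an assumption made for the illustration; the question targets SQL Server 2005. The table name tbl is borrowed from the first answer, and first_id is a hypothetical helper column.

```python
import sqlite3

# Sample rows copied from the question: (id, name, claim, priority).
rows = [
    (1, "yatin", 70, 5), (6, "yatin", 1, 10), (2, "hiren", 30, 3),
    (3, "pankaj", 40, 2), (4, "kavin", 50, 1), (5, "jigo", 10, 4),
    (7, "jigo", 1, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INT, name TEXT, claim INT, priority INT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)

# Inner query: one row per name with its smallest id.
# Middle query: number the names by that id; (n + 1) % 2 yields 0, 1, 0, ...
# Outer query: join the flag back onto every row of the same name.
query = """
SELECT t.id, t.name, t.claim, t.priority, f.alternate_flag
FROM tbl AS t
JOIN (
    SELECT name, first_id,
           (ROW_NUMBER() OVER (ORDER BY first_id) + 1) % 2 AS alternate_flag
    FROM (SELECT name, MIN(id) AS first_id FROM tbl GROUP BY name) AS g
) AS f ON f.name = t.name
ORDER BY f.first_id, t.id
"""
flagged = conn.execute(query).fetchall()
for row in flagged:
    print(row)
```

Adding 1 before taking the remainder makes the first group start at 0, as the question asks.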