SQL several characteristics query - sql

I have a problem with my SQL query. I have operations (each with its own ID), and each operation has participants in one of several roles (seller, facilitator, manager, assistant).
Table looks like:
ID Volume Participant
---------------------------
122 100 Sellers
122 100 Facilitator
123 50 Sellers
123 50 Manager
123 50 Facilitator
124 120 Sellers
124 120 Assistant
125 180 Manager
125 180 Sellers
125 180 Facilitator
I want to extract the operations in which, for example, both a seller and a manager have participated. In this case, that is operations 123 and 125.
SELECT ops.opsId, ops.opsvolume, tranche.participant
FROM ops
INNER JOIN tranche ON ops.opsID = tranche.opsId
WHERE tranche.participant = 'seller'
AND tranche.participant = 'manager'
But obviously a participant cannot have two roles at the same time; it is the operation that has several roles. Any suggestions?

There's not enough information in the question yet to fully answer it. But perhaps you can start with this:
SELECT * FROM ops WHERE ID IN (
SELECT ID
FROM ops
WHERE participant IN ('sellers', 'manager')
GROUP BY ID
HAVING COUNT(DISTINCT participant) = 2 -- DISTINCT so a repeated role row is not counted twice
)
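To try the GROUP BY / HAVING idea on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the single-table layout shown in the question (ID, Volume, Participant) and the role spellings from the sample data ('Sellers', 'Manager'); adjust both to your real schema.

```python
import sqlite3

# In-memory copy of the sample data; the single-table layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ops (ID INTEGER, Volume INTEGER, Participant TEXT)")
conn.executemany(
    "INSERT INTO ops VALUES (?, ?, ?)",
    [
        (122, 100, "Sellers"), (122, 100, "Facilitator"),
        (123, 50, "Sellers"), (123, 50, "Manager"), (123, 50, "Facilitator"),
        (124, 120, "Sellers"), (124, 120, "Assistant"),
        (125, 180, "Manager"), (125, 180, "Sellers"), (125, 180, "Facilitator"),
    ],
)

# Keep only the IDs that carry both roles of interest.
ops_with_both_roles = conn.execute(
    """
    SELECT DISTINCT ID, Volume
    FROM ops
    WHERE ID IN (
        SELECT ID
        FROM ops
        WHERE Participant IN ('Sellers', 'Manager')
        GROUP BY ID
        HAVING COUNT(DISTINCT Participant) = 2
    )
    ORDER BY ID
    """
).fetchall()
print(ops_with_both_roles)  # [(123, 50), (125, 180)]
```

The number in the HAVING clause has to match the number of roles in the IN list, so asking for three roles means comparing against 3.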

I would start by pivoting out new columns, after which you can simply use a WHERE clause
with q as
(
SELECT ops.opsId,
max(ops.opsvolume) Volume,
max(case when tranche.Participant = 'seller' then 1 else 0 end) HasSeller,
max(case when tranche.Participant = 'manager' then 1 else 0 end) HasManager
FROM ops
INNER JOIN tranche ON ops.opsID = tranche.opsId
GROUP BY ops.opsId
)
select opsId, Volume
from q
where HasSeller = 1
and HasManager = 1

Related

Calculate total donations based on an attribute table

I am trying to get a list of donors who have cumulatively donated $5K+ between two different campaigns. My data is something like this
Attributes table
transactionid | attributevalue
--------------+---------------
123231          campaign 1
123456          campaign 2
123217          campaign 1
45623           campaign 2
65791           campaign 3
78931           campaign 4
11111           campaign 5
22222           campaign 6
Donations table
transactionid | donationamount | donorid
--------------+----------------+--------
123231          2000             1233
123456          30000            1456
45623           8000             1233
78931           90               8521
11111           20               1233
22222           68               1456
Donor table
donorid | name
--------+-----
1233      John
1456      Mary
8521      Karl
This is what I tried, but the total I am getting is not right at all.
WITH test AS (
SELECT don.donorid,don.donationamount,a.attributevalue
FROM attributes table a
INNER JOIN donations don ON don.transactionid=a.transactionid
)
SELECT d.donorid,
SUM(CASE WHEN test.attributevalue='campaign 1' OR test.attributevalue='campaign 2'
THEN test.donationamount END) AS campaing_donation,
SUM(test.donationamount) AS total_donations
FROM donortable d
INNER JOIN test ON d.donorid = test.donorid
GROUP BY d.donorid
HAVING SUM(CASE WHEN test.attributevalue = 'campaign 1' OR test.attributevalue = 'campaign 2' THEN test.donationamount END) > 5000
But this is not working: my total donations sum comes out several times higher than the actual value.
Ideally, the final result would be something like this:
donorid | campaign_amount | totalamount
--------+-----------------+------------
1233      10000             10020
1456      30000             30068
Select
sum(Donations.donationamount) as campaign_amount,
Donor.donorid,
Donor.name
from
Attributes
join Donations on
Donations.transactionid = Attributes.transactionid
join Donor on
Donor.donorid = Donations.donorid
where
Attributes.attributevalue in ('campaign 1','campaign 2')
group by
Donor.donorid,
Donor.name
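create table #transection_tbl(tran_id int,attributevalue varchar(20))
create table #donation_tbl(tran_id int,donation_amount int ,donar_id int)
-- campaign_amount sums only the two campaigns of interest; totalamount sums every donation
select donar_id,
sum(case when attributevalue in ('campaign 1','campaign 2') then donation_amount else 0 end) as 'campaign_amount',
sum(donation_amount) as 'totalamount'
from #transection_tbl as t1
inner join #donation_tbl as t2 on t1.tran_id=t2.tran_id
group by donar_id
having sum(case when attributevalue in ('campaign 1','campaign 2') then donation_amount else 0 end) >= 5000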
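As a check against the expected result in the question, here is a small sketch that runs a conditional sum over the sample rows with Python's sqlite3. It assumes exactly one attribute row per transaction and a threshold of 5000 or more; if the real attributes table holds several rows per transaction, the join repeats each donation and inflates the sums, which would match the symptom described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attributes (transactionid INTEGER, attributevalue TEXT)")
conn.execute(
    "CREATE TABLE donations (transactionid INTEGER, donationamount INTEGER, donorid INTEGER)"
)
conn.executemany(
    "INSERT INTO attributes VALUES (?, ?)",
    [
        (123231, "campaign 1"), (123456, "campaign 2"), (123217, "campaign 1"),
        (45623, "campaign 2"), (65791, "campaign 3"), (78931, "campaign 4"),
        (11111, "campaign 5"), (22222, "campaign 6"),
    ],
)
conn.executemany(
    "INSERT INTO donations VALUES (?, ?, ?)",
    [
        (123231, 2000, 1233), (123456, 30000, 1456), (45623, 8000, 1233),
        (78931, 90, 8521), (11111, 20, 1233), (22222, 68, 1456),
    ],
)

# One pass per donor: a conditional sum for the two campaigns, a plain sum for the total.
donor_totals = conn.execute(
    """
    SELECT d.donorid,
           SUM(CASE WHEN a.attributevalue IN ('campaign 1', 'campaign 2')
                    THEN d.donationamount ELSE 0 END) AS campaign_amount,
           SUM(d.donationamount) AS totalamount
    FROM donations d
    JOIN attributes a ON a.transactionid = d.transactionid
    GROUP BY d.donorid
    HAVING SUM(CASE WHEN a.attributevalue IN ('campaign 1', 'campaign 2')
                    THEN d.donationamount ELSE 0 END) >= 5000
    ORDER BY d.donorid
    """
).fetchall()
print(donor_totals)  # [(1233, 10000, 10020), (1456, 30000, 30068)]
```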

How to Select ID's in SQL (Databricks) in which at least 2 items from a list are present

I'm working with patient-level data in Azure Databricks and I'm trying to build out a cohort of patients that have at least 2 diagnoses from a list of specific diagnosis codes. This is essentially what the table looks like:
CLAIM_ID | PTNT_ID | ICD_CD | DATE
---------+---------+--------+------------
1 101 2500 01_25_2020
2 101 3850 03_13_2018
3 222 2500 10_26_2018
4 222 8888 11_30_2018
5 222 9155 04_01_2019
6 871 2500 02_17_2020
7 871 3200 09_09_2019
The list of ICD_CD codes of interest is something like [2500, 3850, 8888]. In this case, I would want to return TOTAL UNIQUE PTNT_ID = 2. These would be PTNT_ID = (101, 222) as these are the only two patients that have at least 2 ICD_CD codes of interest.
When I use something like this, I'm able to return all of the relevant PTNT_ID values, but I'm not able to get the total count of these PTNT_ID:
select mc.PTNT_ID
from MEDICAL_CLAIMS mc
where mc.ICD_CD in ( /* list of ICD_CD of interest */ )
group by mc.PTNT_ID
having count(distinct mc.ICD_CD) >= 2
When I try to add a COUNT statement, it returns an error.
Just select from the query:
select count(*)
from
(
select mc.PTNT_ID
from MEDICAL_CLAIMS mc
where mc.ICD_CD in ( /* list of ICD_CD of interest */ )
group by mc.PTNT_ID
having count(distinct mc.ICD_CD) >= 2
) ptnts;
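Here is a quick way to sanity-check the wrapped query on the sample rows, sketched with Python's sqlite3 rather than Databricks, so treat it as an illustration only. It loads just the columns the query needs and assumes the code list is (2500, 3850, 8888).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MEDICAL_CLAIMS (CLAIM_ID INTEGER, PTNT_ID INTEGER, ICD_CD INTEGER)"
)
conn.executemany(
    "INSERT INTO MEDICAL_CLAIMS VALUES (?, ?, ?)",
    [
        (1, 101, 2500), (2, 101, 3850),
        (3, 222, 2500), (4, 222, 8888), (5, 222, 9155),
        (6, 871, 2500), (7, 871, 3200),
    ],
)

# Inner query: patients with at least 2 distinct codes of interest.
# Outer query: how many such patients there are.
patient_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        SELECT mc.PTNT_ID
        FROM MEDICAL_CLAIMS mc
        WHERE mc.ICD_CD IN (2500, 3850, 8888)
        GROUP BY mc.PTNT_ID
        HAVING COUNT(DISTINCT mc.ICD_CD) >= 2
    ) ptnts
    """
).fetchone()[0]
print(patient_count)  # 2 (patients 101 and 222)
```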

SQL Server case statement in select clause

I have a query that returns the StudentId and PercentageScored for each exam a student attended. A student can attend multiple exams.
StudentId | PercentageScored
101 82
102 57
101 69
103 71
103 42
Below is a sample query, my actual query looks similar to the below.
Select s.StudentId, m.[PercentageScored]
FROM dbo.[Student] S
Inner join dbo.[Marks] m
ON S.[StudentId] = m.[StudentId]
WHERE S.[StudentGroup] = 12 AND S.[Active] = 1
Now I need to add some logic so that my output looks like this:
StudentId | FirstClass | SecondClass | ThirdClass
101
102
103
104
If a student's PercentageScored is above 80%, that counts as 1st class; between 60% and 80%, 2nd class; below 60%, 3rd class. For each student I need the counts: how many times they scored more than 80%, how many times between 60% and 80%, and how many times below 60%.
Using conditional aggregation:
SELECT
s.StudentId,
COUNT(CASE WHEN m.[PercentageScored] > 80 THEN 1 END) AS FirstClass,
COUNT(CASE WHEN m.[PercentageScored] > 60 AND
m.[PercentageScored] <= 80 THEN 1 END) AS SecondClass,
COUNT(CASE WHEN m.[PercentageScored] <= 60 THEN 1 END) AS ThirdClass
FROM dbo.[Student] s
INNER JOIN dbo.[Marks] m
ON s.[StudentId] = m.[StudentId]
WHERE
s.[StudentGroup] = 12 AND s.[Active] = 1
GROUP BY
s.StudentId;
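To see the conditional aggregation on the sample scores, here is a sketch in Python's sqlite3 (not SQL Server, so the bracket quoting is dropped). It assumes a student 104 with no marks, as in the desired output, and swaps in a LEFT JOIN so that such a student still appears with zero counts; with the INNER JOIN above, that row would be missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StudentId INTEGER, StudentGroup INTEGER, Active INTEGER)")
conn.execute("CREATE TABLE Marks (StudentId INTEGER, PercentageScored INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, 12, 1)",
    [(101,), (102,), (103,), (104,)],  # 104 has no marks (assumed for illustration)
)
conn.executemany(
    "INSERT INTO Marks VALUES (?, ?)",
    [(101, 82), (102, 57), (101, 69), (103, 71), (103, 42)],
)

# COUNT ignores NULLs, so each CASE counts only the rows in its band.
class_counts = conn.execute(
    """
    SELECT s.StudentId,
           COUNT(CASE WHEN m.PercentageScored > 80 THEN 1 END) AS FirstClass,
           COUNT(CASE WHEN m.PercentageScored > 60
                       AND m.PercentageScored <= 80 THEN 1 END) AS SecondClass,
           COUNT(CASE WHEN m.PercentageScored <= 60 THEN 1 END) AS ThirdClass
    FROM Student s
    LEFT JOIN Marks m ON s.StudentId = m.StudentId
    WHERE s.StudentGroup = 12 AND s.Active = 1
    GROUP BY s.StudentId
    ORDER BY s.StudentId
    """
).fetchall()
print(class_counts)  # [(101, 1, 1, 0), (102, 0, 0, 1), (103, 0, 1, 1), (104, 0, 0, 0)]
```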

How do I write sql query from this result?

I wasn't sure what the title of my question should be, so sorry about that.
I'm trying to write a SQL query to find the number of members who should be reimbursed by a pharmacy.
For example: I went to the pharmacy and got a vaccine, but by mistake I paid out of my own pocket, so now the pharmacy needs to reimburse me that amount. Let's say I have data like this:
MemberId Name ServiceDate PresNumber PersonId ClaimId AdminFee(in $)
1 John 1/1/2011 123 345 456 0
1 John 1/21/2011 123 345 987 20
2 Mike 2/3/2011 234 567 342 0
2 Mike 2/25/2011 234 567 564 30
5 Linda 1/4/2011 432 543 575 0
5 Linda 4/6/2011 987 543 890 0
6 Sonia 2/6/2011 656 095 439 0
This data shows all members of that pharmacy, both those who have been reimbursed and those who have not.
I need to find each member with an AdminFee of 0, but I also need to check for another record for the same member with the same PresNumber and the same PersonId, where the ServiceDate falls within 30 days of the original record.
If another record meets these criteria and its AdminFee is not 0, that person has already been reimbursed. So from the data you can see that John and Mike have already been reimbursed, and Linda and Sonia still need to be reimbursed.
Can anybody help me write a SQL query for this?
You don't mention what SQL engine you're using, so here is some generic SQL. You'll need to adapt the date math and the True/False return value (in the second option) to whatever engine you're using:
-- Already reimbursed
SELECT * FROM YourTable YT1 WHERE YT1.AdminFee = 0 AND EXISTS
(SELECT * FROM YourTable YT2
WHERE YT2.MemberID = YT1.MemberID AND
YT2.PresNumber = YT1.PresNumber AND
YT2.PersonId = YT1.PersonId AND
YT2.ServiceDate BETWEEN YT1.ServiceDate - 30 AND YT1.ServiceDate + 30 AND
YT2.AdminFee > 0)
-- Need reimbursement
SELECT * FROM YourTable YT1 WHERE YT1.AdminFee = 0 AND NOT EXISTS
(SELECT * FROM YourTable YT2
WHERE YT2.MemberID = YT1.MemberID AND
YT2.PresNumber = YT1.PresNumber AND
YT2.PersonId = YT1.PersonId AND
YT2.ServiceDate BETWEEN YT1.ServiceDate - 30 AND YT1.ServiceDate + 30 AND
YT2.AdminFee > 0)
or
-- Both in one. The LEFT JOIN keeps unmatched rows, so YT2.MemberID is NULL when no reimbursement exists.
SELECT YT1.*,
CASE WHEN YT2.MemberID IS NULL THEN False ELSE True END AS AlreadyReimbursed
FROM YourTable YT1 LEFT JOIN YourTable YT2 ON
YT1.MemberID = YT2.MemberID AND
YT1.PresNumber = YT2.PresNumber AND
YT1.PersonId = YT2.PersonId AND
YT2.ServiceDate BETWEEN YT1.ServiceDate - 30 AND YT1.ServiceDate + 30 AND
YT2.AdminFee > 0
WHERE YT1.AdminFee = 0
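For a runnable version of the "need reimbursement" check, here is a sketch in Python's sqlite3 over the sample rows. The table name claims, the ISO date strings, and the use of SQLite's julianday() for the 30-day window are all assumptions made for this illustration; other engines have their own date arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE claims (
        MemberId INTEGER, Name TEXT, ServiceDate TEXT,
        PresNumber INTEGER, PersonId TEXT, ClaimId INTEGER, AdminFee INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "John", "2011-01-01", 123, "345", 456, 0),
        (1, "John", "2011-01-21", 123, "345", 987, 20),
        (2, "Mike", "2011-02-03", 234, "567", 342, 0),
        (2, "Mike", "2011-02-25", 234, "567", 564, 30),
        (5, "Linda", "2011-01-04", 432, "543", 575, 0),
        (5, "Linda", "2011-04-06", 987, "543", 890, 0),
        (6, "Sonia", "2011-02-06", 656, "095", 439, 0),
    ],
)

# A zero-fee claim still needs reimbursement when no matching claim
# (same member, prescription, person) with a non-zero fee lies within 30 days.
needs_reimbursement = conn.execute(
    """
    SELECT DISTINCT o.MemberId, o.Name
    FROM claims o
    WHERE o.AdminFee = 0
      AND NOT EXISTS (
          SELECT 1
          FROM claims r
          WHERE r.MemberId = o.MemberId
            AND r.PresNumber = o.PresNumber
            AND r.PersonId = o.PersonId
            AND r.AdminFee <> 0
            AND ABS(julianday(r.ServiceDate) - julianday(o.ServiceDate)) <= 30
      )
    ORDER BY o.MemberId
    """
).fetchall()
print(needs_reimbursement)  # [(5, 'Linda'), (6, 'Sonia')]
```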
You need to use the DATEDIFF function in SQL Server with day as the date part, and join the table to itself under another alias. I do not have SQL Server at hand, but I think it should look like this (I called the table PaymentLog):
-- Members whose zero-fee record has a matching non-zero-fee record within 30 days (already reimbursed)
Select distinct p.memberid
from PaymentLog p
inner join PaymentLog d on p.presnumber = d.presnumber
and p.memberid = d.memberid
and p.personid = d.personid
Where p.adminfee = 0
and d.adminfee <> 0
and abs(datediff(day, p.servicedate, d.servicedate)) <= 30

SQL: How do I count the number of clients that have already bought the same product?

I have a table like the one below. It is a record of daily featured products and the customers that purchased them (similar to a daily deal site). A given client can only purchase a product one time per feature, but they may purchase the same product if it is featured multiple times.
FeatureID | ClientID | FeatureDate | ProductID
1 1002 2011-05-01 500
1 2333 2011-05-01 500
1 4458 2011-05-01 500
2 8888 2011-05-10 700
2 2333 2011-05-10 700
2 1111 2011-05-10 700
3 1002 2011-05-20 500
3 4444 2011-05-20 500
4 4444 2011-05-30 500
4 2333 2011-05-30 500
4 1002 2011-05-30 500
I want to count by FeatureID the number of clients that purchased FeatureID X AND who purchased the same productID during a previous feature.
For the table above the expected result would be:
FeatureID | CountofReturningClients
1 0
2 0
3 1
4 3
Ideally I would like to do this with SQL, but am also open to doing some manipulation in Excel/PowerPivot. Thanks!!
If you join your table to itself, you can find the data you're looking for. Be careful, because this query can take a long time if the table has a lot of data and is not indexed well.
SELECT t_current.FEATUREID, COUNT(DISTINCT t_prior.CLIENTID)
FROM table_name t_current
LEFT JOIN table_name t_prior
ON t_current.FEATUREDATE > t_prior.FEATUREDATE
AND t_current.CLIENTID = t_prior.CLIENTID
AND t_current.PRODUCTID = t_prior.PRODUCTID
GROUP BY t_current.FEATUREID
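To confirm that the self-join reproduces the expected counts, here is a sketch over the sample rows using Python's sqlite3. The table name purchases and the result alias are placeholders chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (FeatureID INTEGER, ClientID INTEGER, FeatureDate TEXT, ProductID INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 1002, "2011-05-01", 500), (1, 2333, "2011-05-01", 500), (1, 4458, "2011-05-01", 500),
        (2, 8888, "2011-05-10", 700), (2, 2333, "2011-05-10", 700), (2, 1111, "2011-05-10", 700),
        (3, 1002, "2011-05-20", 500), (3, 4444, "2011-05-20", 500),
        (4, 4444, "2011-05-30", 500), (4, 2333, "2011-05-30", 500), (4, 1002, "2011-05-30", 500),
    ],
)

# LEFT JOIN keeps features with no returning clients; COUNT(DISTINCT ...) skips
# the NULLs from unmatched rows and counts each returning client only once.
returning_clients = conn.execute(
    """
    SELECT cur.FeatureID, COUNT(DISTINCT prior.ClientID) AS CountofReturningClients
    FROM purchases cur
    LEFT JOIN purchases prior
           ON cur.FeatureDate > prior.FeatureDate
          AND cur.ClientID = prior.ClientID
          AND cur.ProductID = prior.ProductID
    GROUP BY cur.FeatureID
    ORDER BY cur.FeatureID
    """
).fetchall()
print(returning_clients)  # [(1, 0), (2, 0), (3, 1), (4, 3)]
```

On a large table, an index covering (ClientID, ProductID, FeatureDate) is one plausible way to keep the self-join from scanning every pair of rows.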
"Per feature, count the clients who match for any earlier Features with the same product"
SELECT
Curr.FeatureID,
COUNT(DISTINCT Prev.ClientID) AS CountofReturningClients
FROM
MyTable Curr
LEFT JOIN
MyTable Prev ON Curr.FeatureID > Prev.FeatureID
AND Curr.ClientID = Prev.ClientID
AND Curr.ProductID = Prev.ProductID
GROUP BY
Curr.FeatureID
Assumptions: You have a table called Features with the columns:
FeatureID, FeatureDate, ProductID
If not, you could always create one on the fly with a temporary table, CTE, or view.
Then:
SELECT
FeatureID
, (
SELECT COUNT(DISTINCT Prev.ClientID)
FROM Purchases Prev
JOIN Purchases Curr ON Curr.ClientID = Prev.ClientID -- only clients who also bought the current feature
WHERE Curr.FeatureID = Features.FeatureID
AND Prev.ProductID = Features.ProductID
AND Prev.FeatureDate < Features.FeatureDate
) as CountOfReturningClients
FROM Features
ORDER BY FeatureID
New to this, but wouldn't the following work?
SELECT FeatureID, (CASE WHEN COUNT(clientid) > 1 THEN COUNT(clientid) ELSE 0 END)
FROM table
GROUP BY featureID