Related
TABLE A: Pre-joined table - Holds a list of providers who belong to a group and the group the provider belongs to. Columns are something like this:
ProviderID (PK, FK) | ProviderName | GroupID | GroupName
1234 | LocalDoctor | 987 | LocalDoctorsUnited
5678 | Physican82 | 987 | LocalDoctorsUnited
9012 | Dentist13 | 153 | DentistryToday
0506 | EyeSpecial | 759 | OphtaSpecialist
TABLE B: Another pre-joined table, holds a list of providers and their demographic information. Columns as such:
ProviderID (PK,FK) | ProviderName | G_or_I | OtherColumnsThatArentInUse
1234 | LocalDoctor | G | Etc.
5678 | Physican82 | G | Etc.
9012 | Dentist13 | I | Etc.
0506 | EyeSpecial | I | Etc.
The expected result is something like this:
ProviderID | ProviderName | ProviderStatus | GroupCount
1234 | LocalDoctor | Group | 2
5678 | Physican82 | Group | 2
9012 | Dentist13 | Individual | N/A
0506 | EyeSpecial | Individual | N/A
The goal is to determine whether or not a provider belongs to a group or operates individually, by the G_or_I column. If the provider belongs to a group, I need to include an additional column that provides the count of total providers in that group.
The Group/Individual portion is relatively easy - I've done something like this:
SELECT DISTINCT
A.ProviderID,
A.ProviderName,
CASE
WHEN B.G_or_I = 'G'
THEN 'Group'
WHEN B.G_or_I = 'I'
THEN 'Individual' END AS ProviderStatus
FROM
TableA A
LEFT OUTER JOIN TableB B
ON A.ProviderID = B.ProviderID;
So far so good, this returns the expected results based on the G_or_I flag.
However, I can't seem to wrap my head around how to complete the COUNT portion. I feel like I may be overthinking it, and stuck in a loop of errors. Some things I've tried:
Add a second CASE STATEMENT:
CASE
WHEN B.G_or_I = 'G'
THEN (
SELECT CountedGroups
FROM (
SELECT ProviderID, count(GroupID) AS CountedGroups
FROM TableA
WHERE A.ProviderID = B.ProviderID
GROUP BY ProviderID --originally had this as ORDER BY, but that was a mis-type on my part
)
)
ELSE 'N/A' END
This returns an error stating that a single row sub-query is returning more than one row. If I limit the number of rows returned to 1, the CountedGroups column returns 1 for every row. This makes me think that its not performing the count function as I expect it to.
I've also tried including a direct count of TableA as a factored sub-query:
WITH CountedGroups AS
( SELECT Provider ID, count(GroupID) As GroupSum
FROM TableA
GROUP BY ProviderID --originally had this as ORDER BY, but that was a mis-type on my part
) --This as a standalone query works just fine
SELECT DISTINCT
A.ProviderID,
A.ProviderName,
CASE
WHEN B.G_or_I = 'G'
THEN 'Group'
WHEN B.G_or_I = 'I'
THEN 'Individual' END AS ProviderStatus,
CASE
WHEN B.G_or_I = 'G'
THEN GroupSum
ELSE 'N/A' END
FROM
CountedGroups CG
JOIN TableA A
ON CG.ProviderID = A.ProviderID
LEFT OUTER JOIN TableB
ON A.ProviderID = B.ProviderID
This returns either null or completely incorrect column values
Other attempts have been a number of variations of this, with a mix of bad results or Oracle errors. As I mentioned above, I'm probably way overthinking it and the solution could be rather simple. Apologies if the information is confusing or I've not provided enough detail. The real tables have a lot of private medical information, and I tried to translate the essence of the issue as best I could.
Thank you.
You can use the CASE..WHEN and analytical function COUNT as follows:
SELECT
A.PROVIDERID,
A.PROVIDERNAME,
CASE
WHEN B.G_OR_I = 'G' THEN 'Group'
ELSE 'Individual'
END AS PROVIDERSTATUS,
CASE
WHEN B.G_OR_I = 'G' THEN TO_CHAR(COUNT(1) OVER(
PARTITION BY A.GROUPID
))
ELSE 'N/A'
END AS GROUPCOUNT
FROM
TABLE_A A
JOIN TABLE_B B ON A.PROVIDERID = B.PROVIDERID;
TO_CHAR is needed on COUNT as output expression must be of the same data type in CASE..WHEN
Your problem seems to be that you are missing a column. You need to add group name, otherwise you won't be able to differentiate rows for the same practitioner who works under multiple business entities (groups). This is probably why you have a DISTINCT on your query. Things looked like duplicates which weren't. Once you've done that, just use an analytic function to figure out the rest:
SELECT ta.providerid,
ta.providername,
DECODE(tb.g_or_i, 'G', 'Group', 'I', 'Individual') AS ProviderStatus,
ta.group_name,
CASE
WHEN tb.g_or_i = 'G' THEN COUNT(DISTINCT ta.provider_id) OVER (PARTITION BY ta.group_id)
ELSE 'N/A'
END AS GROUP_COUNT
FROM table_a ta
INNER JOIN table_b tb ON ta.providerid = tb.providerid
Is it possible that your LEFT JOIN was going the wrong direction? It makes more sense that your base demographic table would have all practitioners in it and then the Group table might be missing some records. For instance if the solo prac was operating under their own SSN and Type I NPI without applying for a separate Type II NPI or TIN.
Consider a simple where clause
select * from table_abc where col_a in (1,2,3)
I know the current conditions
If 1,2,3 are absent, I will not get any results
If 1,2,3 are present, I will get all results associated with 1,2,3
If 1 is present and 2,3 is absent, I will get only results associated with 1.
My question is if we can execute the query for the condition for
If 1 is present and 2,3 is absent, I should still get all results associated with 1,2,3
However, if 1,2,3 are absent, I will not get any results
In other words, can I have a particular value in the where-in clause set as mandatory? How can we change the current query?
EDIT : As pointed out in the comment, I have forgot to add the table structure. It is better that I explain the use case as well.
Table 1 : Admins
ID admin_id
-------------
1 001
2 002
Table 2 : Events
ID event_id
-------------
1 110
2 220
Table 3 : Admins_Events
admin_id event_id
-------------
001 110
001 220
002 220
Now, as a part of filtering, let's say I have the query
SELECT "admins"."admin_id", "events"."event_id" FROM "admins"
LEFT JOIN "admins_events" ON "admins_events"."admin_id" = "admins"."admin_id"
LEFT JOIN "events" ON "events"."event_id" = "admins_events"."event_id"
WHERE (events.event_id IN (110) AND admins.admin_id IN (001))
And currently, I am getting the results as
admin_id event_id
-------------
001 110
where as I would want something like
admin_id event_id
-------------
001 110
001 220
I have to still show the other events associated with the admin even though I do not pass it in the where-in clause. I was thinking to pass all the event_id's every time and match the mandatory event_id and also match the remaining event_ids in case the mandatory event_id is found.
SELECT "admins"."admin_id", "events"."event_id" FROM "admins"
LEFT JOIN "admins_events" ON "admins_events"."admin_id" = "admins"."admin_id"
LEFT JOIN "events" ON "events"."event_id" = "admins_events"."event_id"
WHERE (events.event_id IN (mandatory[110], 220) AND admins.admin_id IN (001))
How can I change the query?
Add another condition with EXISTS in the WHERE clause:
SELECT a.admin_id, e.event_id
FROM Admins a
LEFT JOIN Admins_Events ae ON ae.admin_id = a.admin_id
LEFT JOIN Events e ON e.event_id = ae.event_id
WHERE (e.event_id IN (110, 220) AND a.admin_id IN (001))
AND EXISTS (
SELECT 1 FROM Admins_Events
WHERE event_id = 110 AND admin_id = a.admin_id
)
See the demo.
Results:
| admin_id | event_id |
| -------- | -------- |
| 1 | 110 |
| 1 | 220 |
It sounds like in your example you want all events associated with admins who are associated with event 110. In Mysql I'd do it with the following query, which joins admins to events twice: once to filter for the event you need, and once to get all the other events.
However, in your example, you don't need the admins table in the query at all, since you just need the admin ID, which you can get directly from the admins_events table. I left it in, in case your real "admins" table also had other attributes you wanted (name, location, etc) which are not available in the admins_events join table.
The specific query is:
SELECT "admins"."admin_id", "events"."event_id" FROM "admins"
JOIN "admins_events" ON "admins_events"."admin_id" = "admins"."admin_id"
JOIN "events" ON "events"."event_id" = "admins_events"."event_id"
JOIN "admins_events" AS "specific_admins_events" ON "specific_admins_events"."admin_id" = "admins"."admin_id"
JOIN "events" AS "specific_events" ON "specific_events"."event_id" = "specific_admins_events"."event_id"
WHERE (specific_events.event_id IN (110))
First, you only need the admin_events table.
Then method uses window functions:
SELECT ae.*
FROM (SELECT ae.*,
SUM(CASE WHEN ae.event_id IN (110) THEN 1 ELSE 0 END) OVER (PARTITION BY ae.admin_id) as num_110
FROM admins_events ae
) ae
WHERE ae.admin_id IN ('001') AND -- assume this is a string
num_110 > 0 AND
ae.event_id IN (110, 220);
I have a table similar to this:
stud_ID | first_name | last_name | email | col_num | user_value
1 tom smith 50 Retail
1 tom smith 60 Product
2 Sam wright 50 Retail
2 Sam wright 60 Sale
but need to convert it to: (basically transpose 'col_num' to column headers and change 50 to function, 60 to department)
stud_ID | first_name | last_name | email | Function | Department
1 tom smith Retail Product
2 Sam wright Retail Sale
Unfortunately Pivot doesn't work in my system, just wondering if there is any other way to do this please?
The code that I have so far (sorry for the long list):
SELECT c.person_id_external as stu_id,
c.lname,
c.fname,
c.mi,
a.cpnt_id,
a.cpnt_typ_id,
a.rev_dte,
a.rev_num,
cp.cpnt_title AS cpnt_desc,
a.compl_dte,
a.CMPL_STAT_ID,
b.cmpl_stat_desc,
b.PROVIDE_CRDT,
b.INITIATE_LEVEL1_SURVEY,
b.INITIATE_LEVEL3_SURVEY,
a.SCHD_ID,
a.TOTAL_HRS,
a.CREDIT_HRS,
a.CPE_HRS,
a.CONTACT_HRS,
a.TUITION,
a.INST_NAME,
--a.COMMENTS,
a.BASE_STUD_ID,
a.BASE_CPNT_TYP_ID,
a.BASE_CPNT_ID,
a.BASE_REV_DTE,
a.BASE_CMPL_STAT_ID,
a.BASE_COMPL_DTE,
a.ES_USER_NAME,
a.INTERNAL,
a.GRADE_OPT,
a.GRADE,
a.PMT_ORDER_TICKET_NO,
a.TICKET_SEQUENCE,
a.ORDER_ITEM_ID,
a.ESIG_MESSAGE,
a.ESIG_MEANING_CODE_ID,
a.ESIG_MEANING_CODE_DESC,
a.CPNT_KEY,
a.CURRENCY_CODE,
c.EMP_STAT_ID,
c.EMP_TYP_ID,
c.JL_ID,
c.JP_ID,
c.TARGET_JP_ID,
c.JOB_TITLE,
c.DMN_ID,
c.ORG_ID,
c.REGION_ID,
c.CO_ID,
c.NOTACTIVE,
c.ADDR,
c.CITY,
c.STATE,
c.POSTAL,
c.CNTRY,
c.SUPER,
c.COACH_STUD_ID,
c.HIRE_DTE,
c.TERM_DTE,
c.EMAIL_ADDR,
c.RESUME_LOCN,
c.COMMENTS,
c.SHIPPING_NAME,
c.SHIPPING_CONTACT_NAME,
c.SHIPPING_ADDR,
c.SHIPPING_ADDR1,
c.SHIPPING_CITY,
c.SHIPPING_STATE,
c.SHIPPING_POSTAL,
c.SHIPPING_CNTRY,
c.SHIPPING_PHON_NUM,
c.SHIPPING_FAX_NUM,
c.SHIPPING_EMAIL_ADDR,
c.STUD_PSWD,
c.PIN,
c.PIN_DATE,
c.ENCRYPTED,
c.HAS_ACCESS,
c.BILLING_NAME,
c.BILLING_CONTACT_NAME,
c.BILLING_ADDR,
c.BILLING_ADDR1,
c.BILLING_CITY,
c.BILLING_STATE,
c.BILLING_POSTAL,
c.BILLING_CNTRY,
c.BILLING_PHON_NUM,
c.BILLING_FAX_NUM,
c.BILLING_EMAIL_ADDR,
c.SELF_REGISTRATION,
c.SELF_REGISTRATION_DATE,
c.ACCESS_TO_ORG_FIN_ACT,
c.NOTIFY_DEV_PLAN_ITEM_ADD,
c.NOTIFY_DEV_PLAN_ITEM_MOD,
c.NOTIFY_DEV_PLAN_ITEM_REMOVE,
c.NOTIFY_WHEN_SUB_ITEM_COMPLETE,
c.NOTIFY_WHEN_SUB_ITEM_FAILURE,
c.LOCKED,
c.PASSWORD_EXP_DATE,
c.SECURITY_QUESTION,
c.SECURITY_ANSWER,
c.ROLE_ID,
c.IMAGE_ID,
c.GENDER,
c.PAST_SERVICE,
c.LST_UNLOCK_TSTMP,
c.MANAGE_SUB_SP,
c.MANAGE_OWN_SP,
d.col_num,
d.user_value
FROM pa_cpnt_evthst a,
pa_cmpl_stat b,
pa_student c,
pv_course cp,
pa_stud_user d
WHERE a.cmpl_stat_id = b.cmpl_stat_id
AND a.stud_id = c.stud_id
AND cp.cpnt_typ_id(+) = a.cpnt_typ_id
AND cp.cpnt_id(+) = a.cpnt_id
AND cp.rev_dte(+) = a.rev_dte
AND a.CPNT_TYP_ID != 'SYSTEM_PROGRAM_ENTITY'
AND c.stud_id = d.stud_id
AND d.col_num in ('10','30','50','60')
I would just use conditional aggregation:
select stud_ID, first_name, last_name, email,
max(case when col_num = 50 then user_value end) as function,
max(case when col_num = 60 then user_value end) as department
from t
group by stud_ID, first_name, last_name, email;
Your code seems to have nothing to do with the sample data. I do notice however that you are using implicit join syntax. You really need to learn how to use proper, explicit, standard JOIN syntax.
I'm assuming you have Sql Server 2000 or 2003. What you need to do in that case is create a script with one cursor.
This cursor will create a text with something like this:
string var = "CREATE TABLE #Report (Col1 VARCHAR(20), Col2, VARCHAR(20), " + ColumnName
That way you can create a temp table on the fly, at the end you will need to do a Select of your temp table to get your pivot table ready.
Its not that easy if you are not familiar with cursors.
OR
if there are only few values on your 'pivot' column and they are not going to grow you can also do something like this:
Pivot using SQL Server 2000
I'm unable to understand your code, so I'll just assume the table mentioned in the sample data as stud(because of stud_id).
So here is what I think can do the work of pivot.
SELECT ISNULL(s1.stud_ID, s2.stud_id),
ISNULL(s1.first_name, s2.first_name),
ISNULL(s1.last_name, s2.last_name),
ISNULL(s1.email, s2.email),
s1.user_value as [Function], s2.user_value as Department
FROM stud s1 OUTER JOIN stud s2
ON s1.stud_ID = s2.stud_ID -- Assuming stud_ID is primary key, else join on all primary keys
AND s1.col_num = 50 AND s2.col_num = 60
Explanation: I'm just trying to simulate here what PIVOT does. For every column you want, you create a new table in the JOIN and constaint it to only one value in your col_num column. For example, if there are no values for 50 in s1, the OUTER JOIN will get make it NULL and we need to pull records from s2.
Note: If you need more than 2 new columns, then you can use COALESCE instead of ISNULL
Trying to pull data from a single table called tblTooling where two TlPartNo numbers are equal to different values and the TlToolNo are not equal for these TlPartNo . This is an Access DB and the following statement gets me close, but still gives too much data.
SELECT DISTINCT
tblTooling.TlToolNo,
tblTooling.TlPartNo,
tblTooling.TlOP,
tblTooling.TlQuantity
FROM tblTooling, tblTooling AS tblTooling_1
WHERE (((tblTooling.TlToolNo)<>tblTooling_1.TlToolNo)
AND ((tblTooling.TlPartNo)="10290722")
AND ((tblTooling_1.TlPartNo)="10295379"));
The included image has the tblTooling structure and Data. Plus the expected results from the query.
You seem to want exclude a ToolNo value when it occurs with both PartNo values. In that case you could group intermediate results by ToolNo, and see whether in such a group there is only one PartNo present (with having). In that case keep that record, and in the outer query, get the two other columns added to it:
SELECT DISTINCT
tblTooling.TlToolNo,
tblTooling.TlPartNo,
tblTooling.TlOP,
tblTooling.TlQuantity
FROM tblTooling
INNER JOIN (
SELECT TlToolNo,
Min(TlPartNo) AS MinTlPartNo,
Max(TlPartNo) AS MaxTlPartNo
FROM tblTooling
WHERE TlPartNo IN ("10290722", "10295379")
GROUP BY TlToolNo
HAVING Min(TlPartNo) = Max(TlPartNo)
) AS grp
ON grp.TlToolNo = tblTooling.TlToolNo
AND grp.MinTlPartNo = tblTooling.TlPartNo
Note that for your sample data this will return 4 rows:
TlToolNo | TlPartNo | TlOP | TlQuantity
----------+----------+------+-----------
T00012362 | 10290722 | OP10 | 2
T00012456 | 10290722 | OP10 | 1
T00013456 | 10290722 | OP20 | 1
T00014348 | 10295379 | OP20 | 1
I think you can do this with not exists:
select t.*
from tblTooling as t
where not exists (select 1
from tblTooling as t2
where t2.TlPartNo in ("10290722", "10295379") and
t2.TlToolNo = t.TlToolNo and
t2.tiid <> t.tiid
) and
t.TlPartNo in ("10290722", "10295379");
This saves on the select distinct, which should be a performance boost.
I have a schema (millions of records with proper indexes in place) that looks like this:
groups | interests
------ | ---------
user_id | user_id
group_id | interest_id
A user can like 0..many interests and belong to 0..many groups.
Problem: Given a group ID, I want to get all the interests for all the users that do not belong to that group, and, that share at least one interest with anyone that belongs to the same provided group.
Since the above might be confusing, here's a straightforward example (SQLFiddle):
| 1 | 2 | 3 | 4 | 5 | (User IDs)
|-------------------|
| A | | A | | |
| B | B | B | | B |
| | C | | | |
| | | D | D | |
In the above example users are labeled with numbers while interests have characters.
If we assume that users 1 and 2 belong to group -1, then users 3 and 5 would be interesting:
user_id interest_id
------- -----------
3 A
3 B
3 D
5 B
I already wrote a dumb and very inefficient query that correctly returns the above:
SELECT * FROM "interests" WHERE "user_id" IN (
SELECT "user_id" FROM "interests" WHERE "interest_id" IN (
SELECT "interest_id" FROM "interests" WHERE "user_id" IN (
SELECT "user_id" FROM "groups" WHERE "group_id" = -1
)
) AND "user_id" NOT IN (
SELECT "user_id" FROM "groups" WHERE "group_id" = -1
)
);
But all my attempts to translate that into a proper joined query revealed themselves fruitless: either the query returns way more rows than it should or it just takes 10x as long as the sub-query, like:
SELECT "iii"."user_id" FROM "interests" AS "iii"
WHERE EXISTS
(
SELECT "ii"."user_id", "ii"."interest_id" FROM "groups" AS "gg"
INNER JOIN "interests" AS "ii" ON "gg"."user_id" = "ii"."user_id"
WHERE EXISTS
(
SELECT "i"."interest_id" FROM "groups" AS "g"
INNER JOIN "interests" AS "i" ON "g"."user_id" = "i"."user_id"
WHERE "group_id" = -1 AND "i"."interest_id" = "ii"."interest_id"
) AND "group_id" != -1 AND "ii"."user_id" = "iii"."user_id"
);
I've been struggling trying to optimize this query for the past two nights...
Any help or insight that gets me in the right direction would be greatly appreciated. :)
PS: Ideally, one query that returns an aggregated count of common interests would be even nicer:
user_id totalInterests commonInterests
------- -------------- ---------------
3 3 1/2 (either is fine, but 2 is better)
5 1 1
However, I'm not sure how much slower it would be compared to doing it in code.
Using the following to set up test tables
--drop table Interests ----------------------------
CREATE TABLE Interests
(
InterestId char(1) not null
,UserId int not null
)
INSERT Interests values
('A',1)
,('A',3)
,('B',1)
,('B',2)
,('B',3)
,('B',5)
,('C',2)
,('D',3)
,('D',4)
-- drop table Groups ---------------------
CREATE TABLE Groups
(
GroupId int not null
,UserId int not null
)
INSERT Groups values
(-1, 1)
,(-1, 2)
SELECT * from Groups
SELECT * from Groups
The following query would appear to do what you want:
DECLARE #GroupId int
SET #GroupId = -1
;WITH cteGroupInterests (InterestId)
as (-- List of the interests referenced by the target group
select distinct InterestId
from Groups gr
inner join Interests nt
on nt.UserId = gr.UserId
where gr.GroupId = #GroupId)
-- Aggregate interests for each user
SELECT
UserId
,count(OwnInterstId) OwnInterests
,count(SharedInterestId) SharedInterests
from (-- Subquery lists all interests for each user
select
nt.UserId
,nt.InterestId OwnInterstId
,cte.InterestId SharedInterestId
from Interests nt
left outer join cteGroupInterests cte
on cte.InterestId = nt.InterestId
where not exists (-- Correlated subquery: is "this" user in the target group?)
select 1
from Groups gr
where gr.GroupId = #GroupId
and gr.UserId = nt.UserId)) xx
group by UserId
having count(SharedInterestId) > 0
It appears to work, but I'd want to do more elaborate tests, and I've no idea how well it'd work against millions of rows. Key points are:
cte creates a temp table referenced by the later query; building an actual temp table might be a performance boost
Correlated subqueries can be tricky, but indexes and not exists should make this pretty quick
I was lazy and left out all the underscores, sorry
This is a bit confounding. I think the best approach is exists and not exists:
select i.*
from interest i
where not exists (select 1
from groups g
where i.user_id = g.user_id and
g.group_id = $group_id
) and
exists (select 1
from groups g join
interest i2
on g.user_id = i2.user_id
where g.user_id <> i.user_user_id and
i.interest_id = i2.interest_id
);
The first subquery is saying that the user is not in the group. The second is saying that the interest is shared with someone who is in the group.