How do I identify like sets using SQL? - sql

Using SQL Server, I have a table that looks like this:
What I need to do is write a query to identify scenarios where the Name and Permissions fields are equal so that I can give them a unique Set ID.
For instance, rows 2 and 4 would be a set that I can give a SetID, and rows 6 and 7 are another set that gets a different SetID. But rows 2 and 3 are NOT a set.
So far I have tried using DENSE_RANK() OVER (ORDER BY Name), which adds an id based on like Names but doesn't take matching permissions into account. I have also tried joining the table to itself, but with millions of rows of data I end up with unwanted duplicates.
The logic I am following is this:
If (Name and Permissions) of one row = (Name and Permissions) of another row give them a SetID to share.
Please help; I have been banging my head against the wall with this one. Ideally a SQL query would accomplish this, but I am open to anything.
Thank you!

You could do it for example like this:
select
Name,
Permission,
row_number() over (order by Name, Permission) as RN
from (
select distinct
Name,
Permission
from
permissions
) TMP
order by Name, Permission
The inner select gets the distinct combinations, and the outer one assigns the numbers.
SQL Fiddle: http://sqlfiddle.com/#!6/c8319/3
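If the SetID is needed on every original row rather than on the distinct combinations, one option is DENSE_RANK() ordered by both columns. The following is only a sketch: it runs against an in-memory SQLite database (3.25 or later, for window functions) instead of SQL Server, and the AccountName column and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; the real table and its columns may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE permissions (AccountName TEXT, Name TEXT, Permission TEXT)"
)
conn.executemany(
    "INSERT INTO permissions VALUES (?, ?, ?)",
    [
        ("acct1", "Alice", "Read"),
        ("acct2", "Bob", "Write"),
        ("acct3", "Bob", "Read"),
        ("acct4", "Bob", "Write"),
        ("acct5", "Carol", "Read"),
    ],
)

# Rows with the same (Name, Permission) pair tie in the ordering,
# so DENSE_RANK gives them the same number.
rows = conn.execute(
    """
    SELECT AccountName, Name, Permission,
           DENSE_RANK() OVER (ORDER BY Name, Permission) AS SetID
    FROM permissions
    ORDER BY AccountName
    """
).fetchall()

set_ids = {account: set_id for account, _, _, set_id in rows}
for row in rows:
    print(row)
```

Here acct2 and acct4 share a SetID because both are (Bob, Write), while acct3 gets a different one even though the Name matches.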

This will probably do something similar to what you want.
SELECT
name,
permissions,
accountname,
DENSE_RANK() OVER (ORDER BY name, permissions) as SetID
FROM table;

Related

SQL JOIN to select MAX value among multiple user attempts returns two values when both attempts have the same value

Good morning, everyone!
I have a pretty simple SELECT/JOIN statement that gets some imported data from a placement test and returns the highest scored attempt a user made, the best score. Users can take this test multiple times, so we just use the best attempt. What if a user makes multiple attempts (say, takes it twice,) and receives the SAME score both times?
My current query ends up returning BOTH of those records, as they're both equal, so MAX() returns both. There are no primary keys set up on this yet--the query I'm using below is the one I hope to add into an INSERT statement for another table, once I only get a SINGLE best attempt per User (StudentID), and set that StudentID as the key. So you see my problem...
I've tried a few DISTINCT or TOP statements in my query but either I'm putting them into the wrong part of the query or they still return two records for a user who had identically scored attempts. Any suggestions?
SELECT p.*
FROM
(SELECT
StudentID, MAX(PlacementResults) AS PlacementResults
FROM AleksMathResults
GROUP BY StudentID)
AS mx
JOIN AleksMathResults p ON mx.StudentID = p.StudentID AND mx.PlacementResults = p.PlacementResults
ORDER BY
StudentID
Sounds like you want row_number():
SELECT amr.*
FROM (SELECT amr.*,
ROW_NUMBER() OVER (PARTITION BY StudentID ORDER BY PlacementResults DESC) as seqnum
FROM AleksMathResults amr
) amr
WHERE seqnum = 1;
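To see why this returns a single row even when two attempts tie, here is a small sketch run against SQLite (3.25 or later) rather than SQL Server. The AttemptID column is an assumption added only to make the tie-break deterministic; without a second ORDER BY key, which of the tied rows gets seqnum = 1 is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AttemptID is a hypothetical column used as a tie-breaker.
conn.execute(
    "CREATE TABLE AleksMathResults ("
    "AttemptID INTEGER, StudentID INTEGER, PlacementResults INTEGER)"
)
conn.executemany(
    "INSERT INTO AleksMathResults VALUES (?, ?, ?)",
    [
        (1, 100, 70),
        (2, 100, 85),  # student 100 scores 85 twice
        (3, 100, 85),
        (4, 200, 60),
    ],
)

# ROW_NUMBER never repeats within a partition, so exactly one
# row per StudentID ends up with seqnum = 1.
best = conn.execute(
    """
    SELECT StudentID, PlacementResults, AttemptID
    FROM (SELECT amr.*,
                 ROW_NUMBER() OVER (PARTITION BY StudentID
                                    ORDER BY PlacementResults DESC, AttemptID) AS seqnum
          FROM AleksMathResults amr
         ) ranked
    WHERE seqnum = 1
    ORDER BY StudentID
    """
).fetchall()
print(best)
```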

De-Dupe Query in MS Access

I'm using Access to create a report. I have several duplicate records in my Avail_Amt field. I used the Query Wizard to de-dup records from a table, like this.
In (SELECT [DEALLOCALBAL] FROM [TBL_DATA] As Tmp GROUP BY [DEALLOCALBAL] HAVING Count(*)=1 )
What I really want to do, though, is de-dupe the records in my Query, not in my Table. I think it should look like the script below, but it's not working.
In (SELECT [DEALLOCALBAL] FROM [dbo_LIMIT_HIST.Avail_Amt] As Tmp GROUP BY [DEALLOCALBAL] HAVING Count(*)=1 )
dbo_LIMIT_HIST is the Table in SQL Server and Avail_Amt is the Field that I am trying to de-dupe. Any idea what could be wrong here?
My data set looks like this:
Basically, for each Contact ID I could have multiple DEALLOCALBAL amounts. I want to capture each unique amount for each Contact ID, but I don't want dupes.
Thanks!!

SQL Finding Duplicates with multiple criteria but different ID numbers

I work with claims and I have been trying to write a query that captures different claim numbers with multiple criteria; however, I cannot get the desired result. The picture I attached gives a rough idea of the table I am working with. I need to return different claim numbers with the following criteria being the same:
Sum(billed), Diagnosis_code, Rev_code, Cpt_Code, POS_Code, Member_ID, Provider_ID, Organization_ID, DOS, Rendering_Provider_ID.
Those criteria need to match exactly, and they may not follow the same ascending or descending order as shown in the table. Here is the screenshot.
I only want claim_no 101 and 102 to return because they have different claim numbers but match everything else. I do not want claim_no 103 because it does not match all of the above criteria.
I work with SQL Server 2012. I don't know if it matters, but the DOS data type is datetime. Any help will be greatly appreciated. Thanks.
If you want rows that match another row, you can do something like this:
select t.*
from t
where exists (select 1
from t t2
where t2.claim_no <> t.claim_no and
t2.Diagnosis_code = t.Diagnosis_code and
t2.Rev_code = t.Rev_code and
. . .
);
Fill in the . . . with the conditions that you want.
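As a rough illustration of the EXISTS pattern, the sketch below uses SQLite in place of SQL Server, a hypothetical claims table with one row per claim, and only three of the matching columns. The real table has more columns and would need them all added to the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table: one row per claim, three match columns.
conn.execute(
    "CREATE TABLE claims ("
    "claim_no INTEGER, Diagnosis_code TEXT, Rev_code TEXT, Member_ID TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        (101, "D1", "R1", "M1"),
        (102, "D1", "R1", "M1"),  # same values as 101, different claim_no
        (103, "D1", "R2", "M1"),  # Rev_code differs, so no match
    ],
)

# Keep a row only if some other claim has the same values in every column.
matches = [
    claim_no
    for (claim_no,) in conn.execute(
        """
        SELECT t.claim_no
        FROM claims t
        WHERE EXISTS (SELECT 1
                      FROM claims t2
                      WHERE t2.claim_no <> t.claim_no
                        AND t2.Diagnosis_code = t.Diagnosis_code
                        AND t2.Rev_code = t.Rev_code
                        AND t2.Member_ID = t.Member_ID)
        ORDER BY t.claim_no
        """
    )
]
print(matches)
```

Note that a plain = comparison never matches NULL to NULL, so columns that can be NULL would need separate handling.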
As per your sample data, the query below will work. If you want to add a filter condition, you can append it to this query.
Select claim_no, Sum(billed), Member_ID, Provider_ID, Organization_ID, DOS, Rendering_Provider_ID
from table_name
group by claim_no, Member_ID, Provider_ID, Organization_ID, DOS, Rendering_Provider_ID

Using multiple columns with counts in access database designer

I am trying to display several columns with different counts in a Microsoft Access query. It doesn't let me do certain things a normal query can because it has the SQL design view.
I'd like to display multiple, single, etc. columns with their counts.
Note: the table names and attributes have been changed.
select (select count(*)as multiple from (select userId from dbo.Purchases
where userId is not null GRoup by userId having count(*)>1) x), (
select count(*)as single from (select userId from dbo.Purchases where
userId is not null GRoup by userId having count(*)=1) x );
If I do these separately I can display it, but I'd like to combine them into one query and one row. Is this possible?
select count(*)as multiple from (select userId from dbo.Purchases
where userId is not null GRoup by userId having count(*)>1) x
It's very easy with 2 queries:
First one, saved as "Purchases Summary"
Select UserID, count(UserID) as Count from Purchases Group By UserID
With a 2nd built on it:
SELECT Sum(IIf([count]=1,1)) AS [Single], Sum(IIf([count]>1,1)) AS Multiple FROM [Purchases Summary]
I cannot find a clever way to combine this into a single query.
I don't know what my problem last night was, but the single query is
SELECT Sum(IIf([count]=1,1)) AS [Single], Sum(IIf([count]>1,1)) AS Multiple
FROM (Select UserID, count(UserID) as Count from Purchases Group By UserID)
Don George's solution would also work.
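The single-query approach is conditional aggregation over a per-user count. The sketch below shows the same idea in SQLite rather than Access, so it uses CASE where Access would use IIf; the sample userId values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchases (userId INTEGER)")
# Hypothetical data: users 1 and 3 bought more than once,
# users 2 and 4 bought once, and one row has no user.
conn.executemany(
    "INSERT INTO Purchases VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,), (None,), (4,)],
)

# Inner query: one row per user with a purchase count.
# Outer query: count how many users fall into each bucket.
single, multiple = conn.execute(
    """
    SELECT SUM(CASE WHEN cnt = 1 THEN 1 ELSE 0 END) AS single,
           SUM(CASE WHEN cnt > 1 THEN 1 ELSE 0 END) AS multiple
    FROM (SELECT userId, COUNT(*) AS cnt
          FROM Purchases
          WHERE userId IS NOT NULL
          GROUP BY userId) x
    """
).fetchone()
print(single, multiple)
```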
I ended up using a form and VBA for each column. An issue I had was that I needed to use a distinct call for unique IDs and when there is a sql design view that's not really supported. distinctrow is supported, but it would not work for my query. I ended up writing it as before so that it did not need distinct.
This is the VBA I used to override each input inside of an Access form. The CurrentDb needs to be connected to the database properly before this will work.
selectStatement = "SELECT Count(* ) FROM (SELECT userID FROM dbo_Purchases WHERE userID is not null GROUP BY userID HAVING count(*)>1) AS x;"
rs = CurrentDb.OpenRecordset(selectStatement).Fields(0).Value
[Text30].Value = rs

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
Find the number of a row with a given id.
Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using some SQL:2003 features (such as row_number supported in Firebird 3.0)? I am by no means an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?
Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
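To find the "row" number of a row with a given id:
select rownum
from (select id, row_number() over (order by name, id) as rownum
from t
) t
where id = <id>
The row numbers have to be computed in a subquery, because a where clause in the same query would filter the rows before they are numbered and the result would always be 1. This assumes the ids are unique.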
To solve the second problem:
select t.*
from (select id, row_number() over (order by name, id) as rownum
from t
) t join
(select id, row_number() over (order by name, id) as rownum
from t
) tid
on tid.id = <id> and
t.rownum between tid.rownum - 5 and tid.rownum + 5
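The neighbour lookup can be tried out on a small table. This is a sketch only: it uses SQLite (3.25 or later) instead of Firebird, a common table expression so the numbering is written once, and twenty made-up people rows whose name order is the reverse of their id order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical data: id 1 is 'person20', id 20 is 'person01'.
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(i, f"person{21 - i:02d}") for i in range(1, 21)],
)

target_id = 10

# Number every row first, then find the target's number and
# keep the rows within 5 positions of it.
neighbours = conn.execute(
    """
    WITH numbered AS (
        SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS rn
        FROM people
    )
    SELECT n.id, n.rn
    FROM numbered n
    JOIN numbered target ON target.id = ?
    WHERE n.rn BETWEEN target.rn - 5 AND target.rn + 5
    ORDER BY n.rn
    """,
    (target_id,),
).fetchall()

for row in neighbours:
    print(row)
```

The numbering still scans the whole table on every call, so this shows the logic rather than a fix for the performance concern raised in the question.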
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can serve as your counter, simplifying your queries.