I have a User table which looks like below
UserID Name SponsorID (FK)
1 A null
2 B 1
3 C 1
4 D 3
The SponsorID refers to UserID. Now I need write a query which returns all user who is descendant of a given UserID.
Example
For UserID 1 the query returns all 4 users
For UserID 3 the query should return 1 user
The current implementation is getting the user list by looping each direct downline and I am looking for a better solution if it's possible.
UPDATE
Current code
public void findDownlineSponsorByUserBO(UserBO rootBO) throws Exception {
List<UserBO> downlines = businessOperationService.findUserBySponsorId(rootBO.getId(), "createdDate", false);
memberList.addAll(downlines);
for (UserBO memberBO : downlines) {
findDownlineSponsorByUserBO(memberBO);
}
}
You're going to have to use an iterative or recursive solution here, unless (perhaps) you're limited to one level of sponsor and can relate UserID to SponsorID in one join. You could load the table into a tree structure in memory, and then query that. Loading it would be O(nlogn), but then traversing it would be O(logn).
This other SO question might give you some useful ideas: Is it possible to query a tree structure table in MySQL in a single query, to any depth?
Related
I want to ask if its possible to use something like IN operator in Power BI RLS DAX.
The approach I am using now seems very convoluted to me, and I want to ask if it could be done better and simpler.
I have two types of identifiers in my Fact table: Code and GroupCode. Code is NOT NULL identifier and can have multiple unique GroupCodes and GroupCode can be NULL and can be assigned to multiple Code.
Example Fact table:
Code
GroupCode
Client
1
a
John
1
b
Susie
1
c
Mark
2
a
John
2
NULL
Mary
3
b
Susie
I want to create report where User enters by Code, but see all rows with it GroupCode.
In SQL it would be:
SELECT * FROM table
WHERE GroupCode IN (SELECT GroupCode FROM TABLE WHERE Code = '2') or Code = '2'
For now I created Dictionary table from my Fact table with Code and CodeDict, where CodeDict = ISNULL(GroupCode, Code). Also added CodeDict column to Fact table to create relation based on it, between Dict and Fact tables. Should be (1:*), but it shows (*:*).
Example Dict table:
Code
GroupCode
1
a
1
b
1
c
2
a
2
2
3
b
As for RLS rule it is set on Dict table as [Code] = #USERPRINCIPALNAME().
So for example if User open report using Code = 2. RLS will filter Dict table so CodeDict will be (a, 2) and by them, it will filter Fact table to shows rows with Code=(1, 2)
It's very convoluted approach and I don't like it, but I have no idea I could make it other way. I worry about (*:*) relation and that I have to create Bridge Dict table.
What do you think about this approach to this problem?
Is there a way to implement IN operator in RLS rule?
This might be a basic sql questions, however I was curious to know the answer to this.
I need to fetch top one record from the db. Which query would be more efficient, one with where clause or order by?
Example:
Table
Movie
id name isPlaying endDate isDeleted
Above is a versioned table for storing records for movie.
If the endDate is not null and isDeleted = 1 then the record is old and an updated one already exist in this table.
So to fetch the movie "Gladiator" which is currently playing, I can write a query in two ways:
1.
Select m.isPlaying
From Movie m
where m.name=:name (given)
and m.endDate is null and m.isDeleted=0
2. Select TOP 1 m.isPlaying
From Movie m
where m.name=:name (given)
order by m.id desc --- This will always give me the active record (one which is not deleted)
Which query is faster and the correct way to do it?
Update:
id is the only indexed column and id is the unique key. I am expecting the queries to return me only one result.
Update:
Examples:
Movie
id name isPlaying EndDate isDeleted
3 Gladiator 1 03/1/2017 1
4 Gladiator 1 03/1/2017 1
5 Gladiator 0 null 0
I would go with the where clause:
Select m.isPlaying
From Movie m
where m.id = :id and m.endDate is null and m.isDeleted = 0;
This can take advantage of an index on (id, isDeleted, endDate).
Also, the two are not equivalent. The second might return multiple rows when the first returns 1. Or the second might return one row when the first returns none.
The first option might return more than 1 row. Maybe you know it won't because you know what data you have stored but the SQL engine doesn't, and it will affect it's execution plan.
Considering that you only have 1 index and it's on the ID column, the 2nd query should be faster in theory, since it would do an index scan from the highest ID with a predicate for the given name, stopping at the first match.
The first query will do a full table scan while comparing column name, endDate and isDeleted, since it won't stop at the first result that matches.
Posting your execution plans for both queries might enlighten a few loose cables.
I have a large database and i'd like to pull info from a table (Term) where the Names are not linked to a PartyId for a certain SearchId. However:
There are multiple versions of the searches (sometimes 20-40 - otherwise I think SQL - Comparing two rows and two columns would work for me)
The PartyId will almost always be NULL for the first version for the search, and if the same Name for the same SearchId has a PartyId associated in a later version the NULL row should not appear in the results of the query.
I have 8 left joins to display the information requested - 3 of them are joined on the Term table
A very simplified sample of data is below
CASE statement? Join the table with itself for comparison? A temp table or do I just return the fields I'm joining on and/or want to display?
Providing sample data that yields no expected result is not as useful as providing data that gives an expected result..
When asking a question start with defining the problem in plain English. If you can't you don't understand your problem well enough yet. Then define the tables which are involved in the problem (including the columns) and sample data; the SQL you've tried, and what you're expected result is using the data in your sample. Without this minimum information we make many guesses and even with that information we may have to make assumptions; but without a minimum verifiable example showing illustrating your question, helping is problematic.
--End soap box
I'm guessing you're after only the names for a searchID which has a NULL partyID for the highest SearchVerID
So if we eliminated ID 6 from your example data, then 'Bob' would be returned
If we added ID 9 to your sample data for name 'Harry' with a searchID of 2 and a searchVerID of 3 and a null partyID then 'Harry' too would be returned...
If my understanding is correct, then perhaps...
WITH CTE AS (
SELECT Name, Row_Number() over (partition by Name order by SearchVersID Desc)
FROM Term
WHERE SearchID = 2)
SELECT Name
FROM CTE
WHERE RN = 1
and partyID is null;
This assigns a row number (RN) to each name starting at 1 and increasing by one for each entry; for searchID's of 2. The highest searchversion will always have a RN of 1. Then we filter to include only those RN which are 1 and have a null partyID. This would result in only those names having a searchID of 2 the highest search version and a NULL partyID
Ok So I took the question a different way too..
If you simply want all the names not linked to a PartyID for a given search.
SELECT A.*
FROM TERM A
WHERE NOT EXISTS (SELECT 1
FROM TERM B
WHERE A.Name = B.Name
AND SearchID = 2) and partyID is not null)
AND searchID = 2
The above should return all term records associated to searchID 2
that have a partyId. This last method is the exists not exists and set logic I was talking about in comments.
I have a table which has a simple parent child structure
products:
- id
- product_id
- time_created
- ... a few other columns
It is a parent if product_id IS NULL. Product id behaves here like parent_id. Data inside looks like this:
id | product_id
1 NULL
2 1
3 1
4 NULL
4 4
This table is updated every night a new versions are added.
Every user is using a lot of these products but only one version. User is notified if new rows are added for an product_id.
He can stop using id:2 and start using id:3. An another user will continue using id:2 etc.
products table is updated every night and it grows pretty fast. There are around 500000 rows at the moment and every night adds around 20000, probably 5-7000000 changes (new rows) per year.
Is there a way to optimize this database/table structure? Should I change anything? Is it a problem to have so much data in one table?
Your question is not clear. The sample data is suggesting that the parent-child relationship is only one level deep. If so, this is not a particularly hard problem. You can create a query to look up the most recent product id for each product -- and I'm assuming this is the one with the maximum id:
select id, product_id,
max(id) over (partition by coalsesce(product_id, id)) as biggest_id
from table t;
This is then a lookup table, to get the biggest id. It would produce:
id | product_id | biggest_id
1 NULL 3
2 1 3
3 1 3
4 NULL 4
4 4 4
If your table has deeper hierarchies, you can solve the problem using recursive CTEs, or by doing the calculation when the table is updated.
I have a table in SQL Server that contains the following columns :
Id Name ParentId LevelOrder
8 vehicle 0 0/8/
9 car 8 0/8/9/
10 bike 8 0/8/10/
11 House 0 0/11/
...
This creates a tree.
Say that I have the LevelOrder 0/8/, this should return only the car and bike rows, but how do I handle this in SQL Server?
I have tried :
Select * FROM MyTable WHERE LevelOrder >= '0/8/'
but that does not work.
The underscore character will guarantee at least one character comes after '0/8/', so you don't get a match on the "vehicle" row.
SELECT *
FROM MyTable
WHERE LevelOrder LIKE '0/8/_%'
This code allows you to select values that start with 0/8/
Select * FROM MyTable WHERE LevelOrder like '0/8/%'
Okay -
While #Joe's answer is the simplest and easiest to implement (and possibly better performing than what I'm about to propose...), there are some issues with update anomalies.
Specifically:
You already have a parentId column. You need to synchronize both this and the levelOrder column, or risk inconsistent data. (I believe this also violates 1NF, although my understanding of the exact definition is a little sketchy...)
levelOrder contains the entire heirarchy. If any one parent is moved, all children rows must have levelOrder modified to reflect this (potentially very messy).
In light of this, here's what I recommend:
Drop the levelOrder column, as its existence will (generally) cause problems.
Use a recursive CTE and the parentId column to build the heirarchy dynamically. Either leave the column where it is, or move it to a dedicated relationship table. Moving one parent then requires only one cell to be updated, and cannot result in any (data, not semantic) anomalies. The CTE should look similar to this form (will need to be adjusted for purpose):
WITH heir_parent (parentId, id) as (SELECT parentId, id
FROM table
WHERE id =
UNION ALL
SELECT b.parentId, b.id
FROM heir_parent as a
JOIN table as b
ON b.parentId = a.id)
At the moment, the CTE returns a list of all children of the given id, with their id and their immediate parent. It can be adjusted to return a number of other things as well - although I recommend that the CTE be used only to generate the relationship, and join externally to get the remaining data.