SQL query for category hierarchy validations - sql

I need to add validation on category creation.
CASE 1: parentId should be valid if supplied
CASE 2: sibling names must not be duplicated
I have this table categories:
id | parentId | name
-----|-----------|------
1 | NULL | CatA
2 | 1 | CatA.1
(Note: My parent-child hierarchy can go up to the nth level)
Now in the above scenario what should not be allowed is:
I cannot provide an invalid parentId
I cannot create a category with name: CatA where parentId = null
I cannot create a category where name: CatA.1 where parentId = 1
I am working in Node.js, so I need to return these two validation errors:
The provided parentId is invalid
Duplicate name detected
Now I want to achieve this using a single optimized SQL query.
I can use if/else statements later, based on the query response.
But for me it is really important that I use single query and that query should be as optimized as possible.
What I tried so far is:
SELECT
TOP 1 parentId,
name,
(
CASE
WHEN name = 'CatA.2' THEN 1
ELSE 0
END
) sortOrder
FROM
categories
WHERE
parentId = 1
ORDER BY
sortOrder DESC
Now the issue with my query is that it doesn't cover all the scenarios.
Can anyone help me with the query?

The problem with a single query is that you have two cases with different validation needs:
Provided parent_id is null
Provided parent_id is not null
Isn't it easier to write two queries and call the correct one from Node.js?
Query 1: Select rows where parent_id is null and name matches the passed name. If it returns no rows then all is OK; otherwise return the duplicate-name error. Note that the condition must be parent_id IS NULL and not parent_id = NULL, because a comparison with NULL using = is never true.
Query 2: The query you wrote. If it returns no rows, then parent_id is invalid. If it returns a row and sortOrder = 1 then you have a duplicate; otherwise all is well.
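As a rough illustration of the two-branch idea, here is a minimal sketch using Python's built-in sqlite3 module in place of Node.js and SQL Server (both substitutions are assumptions made only so the example runs anywhere). The function name validate_category is hypothetical. One deliberate difference from Query 2: the parent is looked up by its own id, because a query over a parent's children returns no rows both for a missing parent and for a valid parent that has no children yet.

```python
import sqlite3


def validate_category(conn, parent_id, name):
    """Return a list of validation error messages (empty when the input is fine)."""
    if parent_id is None:
        # Root-level category: siblings are the rows whose parentId IS NULL.
        dup = conn.execute(
            "SELECT 1 FROM categories WHERE parentId IS NULL AND name = ?",
            (name,),
        ).fetchone()
    else:
        # The parent must exist as a row in its own right.
        parent = conn.execute(
            "SELECT 1 FROM categories WHERE id = ?", (parent_id,)
        ).fetchone()
        if parent is None:
            return ["The provided parentId is invalid"]
        dup = conn.execute(
            "SELECT 1 FROM categories WHERE parentId = ? AND name = ?",
            (parent_id, name),
        ).fetchone()
    return ["Duplicate name detected"] if dup is not None else []


# Demo data taken from the question's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories (id INTEGER PRIMARY KEY, parentId INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO categories VALUES (?, ?, ?)",
    [(1, None, "CatA"), (2, 1, "CatA.1")],
)

print(validate_category(conn, 99, "CatB"))
print(validate_category(conn, None, "CatA"))
print(validate_category(conn, 1, "CatA.2"))
```

Each branch stays a simple lookup on (parentId, name), so an index on those two columns would probably be enough to keep both checks cheap.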

Related

Updating uniqueidentifier column with same value for rows with matching column value

I need a little help. I have this (simplified) table:
ID | Title | Subtype | RelatedUniqueID
---|-------|---------|----------------
1 | My Title 1 | 1 | NULL
2 | My Title 2 | 1 | NULL
3 | My Title 3 | 2 | NULL
4 | My Title 4 | 2 | NULL
5 | My Title 5 | 2 | NULL
6 | My Title 6 | 3 | NULL
What I am trying to accomplish is generating the same uniqueidentifier for all rows having the same subtype.
So result would be this:
ID | Title | Subtype | RelatedUniqueID
---|-------|---------|----------------
1 | My Title 1 | 1 | 439753d3-9103-4d0e-9dd0-569dc71fd6a3
2 | My Title 2 | 1 | 439753d3-9103-4d0e-9dd0-569dc71fd6a3
3 | My Title 3 | 2 | d0f08203-1197-4cc7-91bb-c4ca34d7cb0a
4 | My Title 4 | 2 | d0f08203-1197-4cc7-91bb-c4ca34d7cb0a
5 | My Title 5 | 2 | d0f08203-1197-4cc7-91bb-c4ca34d7cb0a
6 | My Title 6 | 3 | 055838c6-a814-4bd1-a859-63d4544bb449
Requirements
One query to update all rows at once
The actual table has many more rows with hundreds of subtypes, so manually building a query for each subtype is not an option
Using SQL Server 2017
Thanks for any assist.
Because newid() is evaluated per row, you have to generate the values first, so this has to involve a temporary or permanent table that stores one ID per Subtype value.
So first you need to generate the GUID values per Subtype :
with subtypes as (
select distinct subtype
from t
)
select Subtype, NewId() RelatedId into #Id
from subtypes
And then you can use an updatable CTE to apply these to your base table:
with r as (
select t.*, id.RelatedId
from #id id
join t on t.subtype=id.Subtype
)
update r
set relatedUniqueId=RelatedId
See example DB<>Fiddle
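The same two-step shape (store one identifier per subtype, then copy it onto the base table) can be sketched outside SQL Server. The following is only an illustration under assumptions: it uses Python's sqlite3 module, a temp table named ids (hypothetical), and uuid.uuid4() standing in for newid().

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ID INTEGER PRIMARY KEY, Title TEXT, "
    "Subtype INTEGER, RelatedUniqueID TEXT)"
)
conn.executemany(
    "INSERT INTO t (ID, Title, Subtype) VALUES (?, ?, ?)",
    [(i, f"My Title {i}", s) for i, s in [(1, 1), (2, 1), (3, 2), (4, 2), (5, 2), (6, 3)]],
)

# Step 1: materialise exactly one identifier per distinct subtype.
conn.execute("CREATE TEMP TABLE ids (Subtype INTEGER PRIMARY KEY, RelatedId TEXT)")
subtypes = [row[0] for row in conn.execute("SELECT DISTINCT Subtype FROM t")]
conn.executemany(
    "INSERT INTO ids VALUES (?, ?)",
    [(s, str(uuid.uuid4())) for s in subtypes],
)

# Step 2: copy the stored identifier onto every row of the matching subtype.
conn.execute(
    "UPDATE t SET RelatedUniqueID = "
    "(SELECT RelatedId FROM ids WHERE ids.Subtype = t.Subtype)"
)

rows = conn.execute("SELECT ID, Subtype, RelatedUniqueID FROM t ORDER BY ID").fetchall()
for row in rows:
    print(row)
```

Because the identifiers are stored before the update runs, the result does not depend on when or how often the generator is evaluated.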
You can use an updatable CTE with a window function to get this data:
with r as (
select t.*,
RelatedId = first_value(newid()) over (partition by t.Subtype order by ID rows unbounded preceding)
from t
)
update r
set relatedUniqueId = RelatedId;
db<>fiddle
I warn though, that newid() is somewhat unpredictable in when it is calculated, so don't try messing about with a joined update (unless you pre-save the IDs like @Stu has done).
For example, see this fiddle, the IDs were calculated differently for every row.
I have found the single query solution.
A prerequisite for this to work is that RelatedUniqueID already contains random values (e.g. set the column's default value to newid()).
UPDATE TestTable
SET RelatedUniqueID = TG.RelatedUniqueID
FROM TestTable TG
INNER JOIN TestTable ON TestTable.SubType = TG.SubType
Update
As Stu mentions in the comments, this solution might affect performance on large datasets. Please keep that in mind.

Single query to return rows where field has highest number

I have the following query:
SELECT statement, value, level
FROM records
WHERE user_id=10 AND value IS NOT NULL and disabled IS NULL
It returns the results like the example below:
statement one | 1 | 3
statement two | 1 | 3
statement three | 0.5 | 4
statement four | 0.5 | 4
The last value is the value from the level field and I want to select only results for the highest number value. So in this case, the highest number is 4, so I want it to return statement three and four results.
I thought of first querying the highest number and then making another query that adds an AND level=4 condition.
Just wondering if it's possible to do it as a single SQL query.
You could use rank() to find the highest level:
SELECT statement, value, level
FROM (SELECT statement, value, level, RANK() OVER (ORDER BY level DESC) AS rk
FROM records
WHERE user_id = 10 AND value IS NOT NULL AND disabled IS NULL) AS ranked
WHERE rk = 1
with max_level as (
select max(level) as max_level
from records
)
select statement, "value", level
from records
where
user_id = 10 and
"value" is not null and
disabled is null and
level = (select max_level from max_level)
This uses the overall highest value of level as condition.
SELECT statement, "value", level
FROM records
WHERE user_id = 10
AND "value" IS NOT NULL
AND disabled IS NULL
AND level = (SELECT max(level) FROM records);
This is basically a simplification of Clodoaldo's answer. The subquery expression is just fine, since it returns a single value.
Your example actually uses the highest value of level in the result as condition. Mureinik's answer has an elegant solution for that.
Aside: value is a reserved word in standard SQL. Better not to use it as an identifier, even if that's allowed in Postgres.
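The difference between "highest level overall" and "highest level in the result" only shows up when some other row has a higher level. A small sketch, assuming Python's sqlite3 module instead of Postgres and one hypothetical extra row for a different user:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE records (user_id INTEGER, statement TEXT, '
    '"value" REAL, level INTEGER, disabled INTEGER)'
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?, ?)",
    [
        (10, "statement one", 1, 3, None),
        (10, "statement two", 1, 3, None),
        (10, "statement three", 0.5, 4, None),
        (10, "statement four", 0.5, 4, None),
        (11, "other user", 1, 7, None),  # hypothetical row, not in the question
    ],
)

FILTER = 'user_id = 10 AND "value" IS NOT NULL AND disabled IS NULL'

# Compares against the highest level in the whole table.
overall = conn.execute(
    f"SELECT statement FROM records WHERE {FILTER} "
    "AND level = (SELECT max(level) FROM records)"
).fetchall()

# Compares against the highest level among the filtered rows only.
within = conn.execute(
    f"SELECT statement FROM records WHERE {FILTER} "
    f"AND level = (SELECT max(level) FROM records WHERE {FILTER}) "
    "ORDER BY statement"
).fetchall()

print(overall)
print(within)
```

With the extra row present, the first query finds nothing for user 10, while the second returns statements three and four. Repeating the filter inside the subquery is one way to get the "highest in the result" behaviour without a window function.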

Update table row with certain id while deleting the recurrent row

I have 2 tables
Table name: Attributes
attribute_id | attribute_name
1 | attr_name_1
2 | attr_name_2
3 | attr_name_1
4 | attr_name_2
Table name: Products
product_id | product_name | attribute_id
1 | prod_name_1 | 1
2 | prod_name_2 | 2
3 | prod_name_3 | 3
4 | prod_name_4 | 4
As you can see, attribute_id in the Products table has the IDs (1,2,3,4) instead of (1,2,1,2).
The problem is in the Attributes table: there are repeated values (attribute_name) with different IDs, so I want to:
Pick one ID among the duplicates in the Attributes table
Update the Products table with that picked ID (only where the attribute_id maps to the same name in the Attributes table)
After that, delete the duplicate rows from the Attributes table that are no longer used in the Products table
Output:
Table name: Attributes
attribute_id | attribute_name
1 | attr_name_1
2 | attr_name_2
Table name: Products
product_id | product_name | attribute_id
1 | prod_name_1 | 1
2 | prod_name_2 | 2
3 | prod_name_3 | 1
4 | prod_name_4 | 2
Demo on SQLFiddle
Note:
It would help me a lot to do this in SQL instead of fixing the issue manually.
update Products
set attribute_id = (
select min(attribute_id)
from Attributes a
where a.attribute_name=(select attribute_name from Attributes a2 where a2.attribute_id=Products.attribute_id)
);
DELETE
FROM Attributes
WHERE attribute_id NOT IN
(
SELECT MIN(attribute_id)
FROM Attributes
GROUP BY attribute_name
);
The following may be faster than @Alexander Sigachov's suggestion, but it does require at least SQL Server 2005 to run it, while Alexander's solution would work on any (reasonable) version of SQL Server. Still, even if only for the sake of providing an alternative, here you go:
WITH Min_IDs AS (
SELECT
attribute_id,
min_attribute_id = MIN(attribute_id) OVER (PARTITION BY attribute_name)
FROM Attributes
)
UPDATE p
SET p.attribute_id = a.min_attribute_id
FROM Products p
JOIN Min_IDs a ON a.attribute_id = p.attribute_id
WHERE a.attribute_id <> a.min_attribute_id
;
DELETE FROM Attributes
WHERE attribute_id NOT IN (
SELECT attribute_id
FROM Products
WHERE attribute_id IS NOT NULL
)
;
The first statement's CTE returns a row set where every attribute_id is mapped to the minimum attribute_id for the same attribute_name. By joining to this mapping set, the UPDATE statement uses it to replace attribute_ids in the Products table.
When subsequently deleting from Attributes, it is enough just to check if Attributes.attribute_id is not found in the Products.attribute_id column, which is what the second statement does. That is to say, grouping and aggregation, as in the other answer, is not needed at this point.
The WHERE attribute_id IS NOT NULL condition is added to the second query's subquery in case the column is nullable and may indeed contain NULLs. NULLs need to be filtered out in this case, or their presence would result in the NOT IN predicate evaluating to UNKNOWN, which SQL Server would treat the same as FALSE (and so no row would effectively be deleted). If there cannot be NULLs in Products.attribute_id, the condition may be dropped.
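The NOT IN behaviour with NULLs can be seen in a small sketch. This assumes Python's sqlite3 module rather than SQL Server (the three-valued logic involved is standard SQL) and adds one hypothetical product row whose attribute_id is NULL. It selects the rows that would be deleted instead of deleting them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Attributes (attribute_id INTEGER PRIMARY KEY, attribute_name TEXT)"
)
conn.execute(
    "CREATE TABLE Products (product_id INTEGER PRIMARY KEY, "
    "product_name TEXT, attribute_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Attributes VALUES (?, ?)",
    [(1, "attr_name_1"), (2, "attr_name_2"), (3, "attr_name_1"), (4, "attr_name_2")],
)
# Products as they look after the UPDATE step, plus a hypothetical NULL row.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [
        (1, "prod_name_1", 1),
        (2, "prod_name_2", 2),
        (3, "prod_name_3", 1),
        (4, "prod_name_4", 2),
        (5, "prod_name_5", None),  # hypothetical, not in the question
    ],
)

# Without the filter, the NULL makes NOT IN unknown for every unmatched id.
unfiltered = conn.execute(
    "SELECT attribute_id FROM Attributes "
    "WHERE attribute_id NOT IN (SELECT attribute_id FROM Products)"
).fetchall()

# With the filter, the unused attributes are found as intended.
filtered = conn.execute(
    "SELECT attribute_id FROM Attributes "
    "WHERE attribute_id NOT IN "
    "(SELECT attribute_id FROM Products WHERE attribute_id IS NOT NULL) "
    "ORDER BY attribute_id"
).fetchall()

print(unfiltered)
print(filtered)
```

A NOT EXISTS subquery would be another way to sidestep the issue, since it does not compare against the NULL at all.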

Sorting child elements after their parent element

I'm trying to implement a category table.
A simplified table description is like this
id -- name -- parent_id
assuming a sample data like
id - name - parent_id
1 test1 null
2 test2 null
3 test3 null
4 test4 1
5 test5 4
6 test6 2
7 test7 1
I'm struggling to come up with an sql query that will return the record set in the following order
id - name - parent_id
1 test1 null
4 test4 1
5 test5 4
7 test7 1
2 test2 null
6 test6 2
3 test3 null
Basically the child elements are returned after their parent element.
----------------------- SOLVED BY USING LINQ/recursion in code -------------------------
Not exactly an sql solution, but ultimately it works.
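For readers wondering what "recursion in code" might look like, here is a minimal sketch in Python rather than LINQ (an assumption, as the original code was not posted). The function name order_depth_first is hypothetical. It groups rows by parent_id and then walks the tree depth-first, so every parent is followed by its whole subtree.

```python
from collections import defaultdict


def order_depth_first(rows):
    """Order (id, name, parent_id) rows so each parent precedes its subtree."""
    children = defaultdict(list)
    for row in sorted(rows, key=lambda r: r[0]):  # siblings come out in id order
        children[row[2]].append(row)

    ordered = []

    def visit(parent_id):
        for row in children.get(parent_id, []):
            ordered.append(row)
            visit(row[0])  # descend into this row's own children

    visit(None)  # top-level rows have parent_id None
    return ordered


# Sample data from the question.
SAMPLE = [
    (1, "test1", None),
    (2, "test2", None),
    (3, "test3", None),
    (4, "test4", 1),
    (5, "test5", 4),
    (6, "test6", 2),
    (7, "test7", 1),
]

for row in order_depth_first(SAMPLE):
    print(row)
```

This needs only one plain SELECT of the whole table; the ordering work happens in memory, which is usually fine for category tables of modest size.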
Based on what you are trying to do with the query, you don't need to sort it that way. You just need to ensure that the parents are created first. So run your query sorted by parent ID, put the result into an array and loop over that array. On each iteration, if the item has a parent, check that the parent already exists. If it doesn't, move the item to the end of the array and continue with the next one. Only a few items should end up being moved, so it remains reasonably efficient.
What I have always done in the past is split the database up into the following (I'm not the best at SQL though so there may be some other solutions for you).
categories
- category_id | int(11) | Primary Key / Auto_Increment
..
..
sub_categories
- sub_category_id | int(11) | Primary Key / Auto_Increment
- category_id | int(11) | Foreign Key to categories table
..
..
Here is what I would do:
SELECT id, name, parent_id,
CASE WHEN COALESCE(parent_id, 0) = 0 THEN CAST(id AS varchar(20))
ELSE CAST(parent_id AS varchar(20)) + '.' + CAST(id AS varchar(20)) END AS orderid
FROM myTable -- placeholder for your table name
ORDER BY orderid
This should create a new column called orderid that has the parentid dot the id (1.4, 4.5, etc.) For the columns where the parent id is null, it would put just the id. This way you would get the order as 1, 1.4, 4, 4.5, etc.
Please check the code since I wrote this on the fly without testing. It should be close.
The query below works by adding an extra order_parent column that contains either the parent id or the row's own id, depending on whether the row is itself a parent. It then sorts primarily by order_parent to group them together, then by parent_id to put the nulls (the actual parents) on top.
Two things:
This has one more column than you originally wanted, so just ignore it.
In case your database returns the nulls of parent_id last, add a DESC.
Good question, by the way!
SELECT id,
name,
parent_id,
(
case
when parent_id is null then id
else parent_id
end
) AS order_parent
FROM myTable
ORDER BY order_parent, parent_id

how to query with child relations to same table and order this correctly

Take this table:
id name sub_id
---------------------------
1 A (null)
2 B (null)
3 A2 1
4 A3 1
The sub_id column is a relation to its own table, to the id column.
subid --- 0:1 --- id
Now my problem is writing a SELECT query that shows the child rows (those whose sub_id is not null) directly under their parent row. So this would be the correct order:
1 A (null)
3 A2 1
4 A3 1
2 B (null)
A normal SELECT orders by id. But how, or with which keyword, can I order this correctly?
I don't think a JOIN is possible because I want to get all the rows separately: the rows will be displayed in a GridView (ASP.NET) with an EntityDataSource, and the child rows must be displayed directly under their parent.
Thank you.
Look at Managing Hierarchical Data in MySQL.
Since recursion is an expensive operation, because basically you're firing multiple queries at your database, you could consider using the Nested Set Model. In short, you assign a left and a right number to each row, so that a parent's range encloses the ranges of its children. It's a long article, but it's worth reading. I used it during my internship to bring 1000+ queries down to 1 query.
The handling overhead now lies in updating the table when adding, updating or deleting records, since you then have to update all the records with a bigger 'right-value'. But when you're retrieving the data, it all goes with 1 query :)
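To make the nested set idea concrete, here is a small sketch for the question's four rows. It assumes Python's sqlite3 module, a hypothetical table named nested, and hand-assigned lft/rgt numbers; it is not taken from the linked article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
)
# A spans 1..6 and encloses A2 (2..3) and A3 (4..5); B follows at 7..8.
conn.executemany(
    "INSERT INTO nested VALUES (?, ?, ?, ?)",
    [(1, "A", 1, 6), (2, "B", 7, 8), (3, "A2", 2, 3), (4, "A3", 4, 5)],
)

# One query returns every row with children directly under their parent.
ordered = [row[0] for row in conn.execute("SELECT name FROM nested ORDER BY lft")]

# All descendants of a node are the rows whose range lies inside its range.
descendants = [
    row[0]
    for row in conn.execute(
        "SELECT c.name FROM nested AS c "
        "JOIN nested AS p ON c.lft > p.lft AND c.rgt < p.rgt "
        "WHERE p.name = 'A' ORDER BY c.lft"
    )
]

print(ordered)
print(descendants)
```

Reads stay a single query at any depth. The cost is on writes: inserting a node means shifting the lft/rgt values of every row to its right.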
select * from table1 order by name, sub_id will in this case return your desired result, but only because the parents' names and the children's names are similar. If you're using SQL Server 2005 or later, a recursive CTE will work:
WITH recurse (id, Name, childID, Depth)
AS
(
SELECT id, Name, ISNULL(sub_id, id) AS childID, 0 AS Depth
FROM table1 WHERE sub_id IS NULL
UNION ALL
SELECT table1.id, table1.Name, table1.sub_id, recurse.Depth + 1 AS Depth FROM table1
JOIN recurse ON table1.sub_id = recurse.id
)
SELECT * FROM recurse ORDER BY childID, Depth
SELECT
*
FROM
table1
ORDER BY
COALESCE(sub_id, id), id
By the way, this only works for one level of nesting. Anything deeper requires a recursive CTE.
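For hierarchies deeper than one level, one common approach is to have the recursive CTE build a sortable path for each row and order by that path. The sketch below is an assumption-laden illustration: it uses Python's sqlite3 module (so the syntax is SQLite's WITH RECURSIVE and printf, not T-SQL), the table name table1, and one hypothetical grandchild row to show the deeper case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, sub_id INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "A", None),
        (2, "B", None),
        (3, "A2", 1),
        (4, "A3", 1),
        (5, "A2a", 3),  # hypothetical grandchild, not in the question
    ],
)

QUERY = """
WITH RECURSIVE tree(id, name, sub_id, path) AS (
    -- Anchor: top-level rows start a path with their own zero-padded id.
    SELECT id, name, sub_id, printf('%010d', id)
    FROM table1
    WHERE sub_id IS NULL
    UNION ALL
    -- Step: each child appends its id to the path of its parent.
    SELECT c.id, c.name, c.sub_id, tree.path || '.' || printf('%010d', c.id)
    FROM table1 AS c
    JOIN tree ON c.sub_id = tree.id
)
SELECT id, name, sub_id FROM tree ORDER BY path
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Zero-padding the ids keeps the text comparison consistent with numeric order, so a child always sorts directly after its parent and before the parent's next sibling.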