I've got three tables:
Lessons:
CREATE TABLE lessons (
id SERIAL PRIMARY KEY,
title text NOT NULL,
description text NOT NULL,
vocab_count integer NOT NULL
);
+----+------------+------------------+-------------+
| id | title | description | vocab_count |
+----+------------+------------------+-------------+
| 1 | lesson_one | this is a lesson | 3 |
| 2 | lesson_two | another lesson | 2 |
+----+------------+------------------+-------------+
Lesson_vocabulary:
CREATE TABLE lesson_vocabulary (
lesson_id integer REFERENCES lessons(id),
vocabulary_id integer REFERENCES vocabulary(id)
);
+-----------+---------------+
| lesson_id | vocabulary_id |
+-----------+---------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 2 |
| 2 | 4 |
+-----------+---------------+
Vocabulary:
CREATE TABLE vocabulary (
id integer PRIMARY KEY,
hiragana text NOT NULL,
reading text NOT NULL,
meaning text[] NOT NULL
);
Each lesson contains multiple vocabulary, and each vocabulary can be included in multiple lessons.
How can I get the vocab_count column of the lessons table to be calculated and updated whenever I add more rows to the lesson_vocabulary table? Is this possible, and how would I go about doing this?
Thanks
You can use SQL triggers to serve your purpose. This would be similar to mysql after insert trigger which updates another table's column.
The trigger would look somewhat like this. I am using Oracle SQL, but there would just be minor tweaks for any other implementation.
-- Statement-level trigger (no FOR EACH ROW): in Oracle, a row-level trigger
-- that queries lesson_vocabulary itself raises ORA-04091 (mutating table).
CREATE TRIGGER vocab_trigger
AFTER INSERT ON lesson_vocabulary
begin
for lesson_cur in (select LESSON_ID, COUNT(VOCABULARY_ID) voc_cnt from LESSON_VOCABULARY group by LESSON_ID) LOOP
update LESSONS
set VOCAB_COUNT = LESSON_CUR.VOC_CNT
where id = LESSON_CUR.LESSON_ID;
end loop;
END;
It's better to create a view that calculates that (and get rid of the column in the lessons table):
select l.*, lv.vocab_count
from lessons l
left join (
select lesson_id, count(*)
from lesson_vocabulary
group by lesson_id
) as lv(lesson_id, vocab_count) on l.id = lv.lesson_id
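To try the view idea end to end, here is a minimal sketch using SQLite through Python's sqlite3 module rather than PostgreSQL, so the syntax differs slightly. The view name lessons_with_count and the COALESCE to 0 for lessons without vocabulary are my own additions, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lessons (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE lesson_vocabulary (lesson_id INTEGER, vocabulary_id INTEGER);

-- The count is derived on every read, so it cannot go stale.
CREATE VIEW lessons_with_count AS
SELECT l.id, l.title, COALESCE(lv.vocab_count, 0) AS vocab_count
FROM lessons l
LEFT JOIN (
    SELECT lesson_id, COUNT(*) AS vocab_count
    FROM lesson_vocabulary
    GROUP BY lesson_id
) lv ON lv.lesson_id = l.id;

INSERT INTO lessons VALUES (1, 'lesson_one'), (2, 'lesson_two'), (3, 'empty');
INSERT INTO lesson_vocabulary VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 4);
""")

rows = conn.execute(
    "SELECT id, vocab_count FROM lessons_with_count ORDER BY id"
).fetchall()
print(rows)
```

Because nothing is stored, inserts and deletes in lesson_vocabulary need no extra bookkeeping.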
If you really want to update the lessons table each time the lesson_vocabulary changes, you can run an UPDATE statement like this in a trigger:
update lessons l
set vocab_count = t.cnt
from (
select lesson_id, count(*) as cnt
from lesson_vocabulary
group by lesson_id
) t
where t.lesson_id = l.id;
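If you do go the trigger route, a sketch of what that could look like follows. It is written for SQLite via Python's sqlite3 module so it can be run as-is; PostgreSQL would need a trigger function plus CREATE TRIGGER instead. The trigger names are made up, and unlike the UPDATE above it recounts only the lesson that was touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lessons (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    vocab_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE lesson_vocabulary (
    lesson_id INTEGER REFERENCES lessons(id),
    vocabulary_id INTEGER
);

-- Recount only the lesson referenced by the new row.
CREATE TRIGGER lesson_vocabulary_ai AFTER INSERT ON lesson_vocabulary
BEGIN
    UPDATE lessons
    SET vocab_count = (SELECT COUNT(*) FROM lesson_vocabulary
                       WHERE lesson_id = NEW.lesson_id)
    WHERE id = NEW.lesson_id;
END;

-- Deletes must be handled too, or the stored count drifts.
CREATE TRIGGER lesson_vocabulary_ad AFTER DELETE ON lesson_vocabulary
BEGIN
    UPDATE lessons
    SET vocab_count = (SELECT COUNT(*) FROM lesson_vocabulary
                       WHERE lesson_id = OLD.lesson_id)
    WHERE id = OLD.lesson_id;
END;

INSERT INTO lessons (id, title) VALUES (1, 'lesson_one'), (2, 'lesson_two');
INSERT INTO lesson_vocabulary VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 4);
DELETE FROM lesson_vocabulary WHERE lesson_id = 2 AND vocabulary_id = 4;
""")

counts = dict(conn.execute("SELECT id, vocab_count FROM lessons"))
print(counts)
```

An UPDATE that moves a row to a different lesson_id would need a third trigger, which is one reason the derived approaches are simpler.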
I would recommend using a query for this information:
select l.*,
(select count(*)
from lesson_vocabulary lv
where lv.lesson_id = l.id
) as vocabulary_cnt
from lessons l;
With an index on lesson_vocabulary(lesson_id), this should be quite fast.
I recommend this over an update, because the data remains correct.
I recommend this over a trigger, because it is simpler.
I recommend this over a subquery with aggregation because it should be faster, particularly if you are filtering on the lessons table.
SQL Server 2016
I have a number of tables
Table A Table B Table C Table D
User | DataA User | DataB User | DataC User | DataD
=========== =========== =================== =============
1 | 10 1 | 'hello' 4 | '2020-01-01' 1 | 0.34
2 | 20 2 | 'world'
3 | 30
So some users have data for A,B,C and/or D.
Table UserEnabled
User | A | B | C | D
=============================
1 | 1 | 1 | 0 | 0
2 | 1 | 1 | 0 | 0
3 | 1 | 0 | 0 | 0
4 | 0 | 0 | 1 | 0
Table UserEnabled indicates whether we are interested in any of the data in the corresponding tables A,B,C and/or D.
Now I want to join those tables on User but I do only want the columns where the UserEnabled table has at least one user with a 1 (ie at least one user enabled). Ideally I only want to join the tables that are enabled and not filter the columns from the disabled tables afterwards.
So as a result for all users I would get
User | DataA | DataB | DataC
===============================
1 | 10 | 'hello' | NULL
2 | 20 | 'world' | NULL
3 | 30 | NULL | NULL
4 | NULL | NULL | '2020-01-01'
No user has D enabled, so it does not show up in the query result.
I was going to come up with a dynamic SQL that's built every time I execute the query depending on the state of UserEnabled but I'm afraid this is going to perform poorly on a huge data set as the execution plan will need to be created every time. I want to dynamically display only the enabled data, not columns with all NULL.
Is there another way?
Usage will be a data sheet that may be generated up to a number of times per minute.
You have no choice but to approach this through dynamic SQL. A select query has a fixed set of columns defined when the query is created. No such thing as "variable" columns.
What can you do? One method is to "play a trick". Store the columns as JSON (or XML) and delete the empty columns.
Another method is to create a view that has the specific logic you need. I think you can maintain this view by altering it in a trigger, based on when data in the enabled table changes. That said, altering the view requires dynamic SQL so the code will not be pretty.
Just because I thought this could be fun.
Example
Declare @Col varchar(max) = ''
Declare @Src varchar(max) = ''
Select @Col = @Col+','+Item+'.[Data'+Item+']'
,@Src = @Src+'Left Join [Table'+Item+'] '+Item+' on U.[User]=['+Item+'].[User] and U.['+Item+']=1'+char(13)
From (
Select Item
From ( Select A=max(A)
,B=max(B)
,C=max(C)
,D=max(D)
From UserEnabled
Where 1=1 --<< Use any Key Initial Filter Condition Here
) A
Unpivot ( value for item in (A,B,C,D)) B
Where Value=1
) A
Declare @SQL varchar(max) = '
Select U.[User]'+@Col+'
From UserEnabled U
'+@Src
--Print @SQL
Exec(@SQL)
Returns
User DataA DataB DataC
1 10 Hello NULL
2 20 World NULL
3 30 NULL NULL
4 NULL NULL 2020-01-01
The Generated SQL
Select U.[User],A.[DataA],B.[DataB],C.[DataC]
From UserEnabled U
Left Join [TableA] A on U.[User]=[A].[User] and U.[A]=1
Left Join [TableB] B on U.[User]=[B].[User] and U.[B]=1
Left Join [TableC] C on U.[User]=[C].[User] and U.[C]=1
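For comparison, the same idea (find which flags are enabled for anyone, then assemble the column list and joins) can be sketched on the client side. This uses SQLite through Python's sqlite3 module purely so it runs anywhere; it is not T-SQL, and the sample tables just mirror the question. The flag names come from a fixed list, never from user input, which is what keeps the string building safe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserEnabled ([User] INTEGER PRIMARY KEY, A INT, B INT, C INT, D INT);
CREATE TABLE TableA ([User] INTEGER, DataA INT);
CREATE TABLE TableB ([User] INTEGER, DataB TEXT);
CREATE TABLE TableC ([User] INTEGER, DataC TEXT);
CREATE TABLE TableD ([User] INTEGER, DataD REAL);
INSERT INTO UserEnabled VALUES (1,1,1,0,0), (2,1,1,0,0), (3,1,0,0,0), (4,0,0,1,0);
INSERT INTO TableA VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO TableB VALUES (1, 'hello'), (2, 'world');
INSERT INTO TableC VALUES (4, '2020-01-01');
INSERT INTO TableD VALUES (1, 0.34);
""")

flags = ["A", "B", "C", "D"]  # fixed whitelist of flag/table suffixes

# One row holding MAX() of every flag: 1 means at least one user enabled it.
agg = ", ".join(f"MAX({f})" for f in flags)
maxima = conn.execute(f"SELECT {agg} FROM UserEnabled").fetchone()
enabled = [f for f, m in zip(flags, maxima) if m == 1]

# Build the select list and the joins only for the enabled tables.
cols = "".join(f", {f}.Data{f}" for f in enabled)
joins = "".join(
    f"\nLEFT JOIN Table{f} AS {f} ON {f}.[User] = U.[User] AND U.{f} = 1"
    for f in enabled
)
sql = f"SELECT U.[User]{cols}\nFROM UserEnabled AS U{joins}\nORDER BY U.[User]"

rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

Since the generated text only changes when the set of enabled flags changes, the statement could be cached and rebuilt only on changes to UserEnabled.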
If all the relations are 1:1, you can make one query with
...
FROM u
LEFT JOIN a ON u.id = a.u_id
LEFT JOIN b ON u.id = b.u_id
LEFT JOIN c ON u.id = c.u_id
LEFT JOIN d ON u.id = d.u_id
...
and use display logic on the client to omit the irrelevant columns.
If more than one relation is 1:N, then you'd likely have to do multiple queries anyway to prevent N1xN2 results.
I have a query to which I pass a table type with the columns EmpKey and TaskId, like this:
@AssignNotificationTableType [dbo].[udf_TaskNotification] READONLY
INSERT INTO [TaskNotification] ([TaskId], [EmpKey])
SELECT
[ANT].[TaskId], [E].[EmpKey]
FROM
@AssignNotificationTableType AS [ANT]
INNER JOIN
[Employee] AS [E] ON [ANT].[EmpGuid] = [E].[EmpGuid]
So my table looks like this:
+--------------------------------------+--------------------------------------+--------+
| TaskNotificationId | TaskId | EmpKey |
+--------------------------------------+--------------------------------------+--------+
| EEE3D3F8-F190-E811-841F-C81F66DACA6A | D0440DEB-404C-4006-870F-E95BFFA840E0 | 44 |
| EFE3D3F8-F190-E811-841F-C81F66DACA6A | D0440DEB-404C-4006-870F-E95BFFA840E0 | 49 |
+--------------------------------------+--------------------------------------+--------+
As you can see, two items have the same TaskId but different EmpKey values. If I send the same TaskId D0440DEB-404C-4006-870F-E95BFFA840E0 again, I want to insert a row only if its EmpKey does not already exist for that TaskId.
So if I send something like:
+--+--------------------------------------+--------+
| | TaskId | EmpKey |
+--+--------------------------------------+--------+
| | D0440DEB-404C-4006-870F-E95BFFA840E0 | 44 |
| | D0440DEB-404C-4006-870F-E95BFFA840E0 | 49 |
| | D0440DEB-404C-4006-870F-E95BFFA840E0 | 54 |
+--+--------------------------------------+--------+
It should insert only the last row, because EmpKey 54 does not exist for that TaskId.
I tried to do this in the WHERE clause with NOT IN:
INSERT INTO [TaskNotification] ([TaskId], [EmpKey])
SELECT
[ANT].[TaskId], [E].[EmpKey]
FROM
@AssignNotificationTableType AS [ANT]
INNER JOIN
[Employee] AS [E] ON [ANT].[EmpGuid] = [E].[EmpGuid]
WHERE
[E].[EmpKey] NOT IN (SELECT EmpKey
FROM [TaskNotification]
WHERE TaskId = (SELECT TaskId
FROM @AssignNotificationTableType))
But when I run it, it doesn't insert anything. What am I doing wrong? Regards
Add the target table as a left join to the select statement:
INSERT INTO [TaskNotification]
(
[TaskId]
, [EmpKey]
)
SELECT
[ANT].[TaskId]
, [E].[EmpKey]
FROM @AssignNotificationTableType AS [ANT]
INNER JOIN [Employee] AS [E] ON [ANT].[EmpGuid] = [E].[EmpGuid]
LEFT JOIN [TaskNotification] AS [TN] ON [TN].[TaskId] = [ANT].[TaskId]
AND [TN].[EmpKey] = [E].[EmpKey]
WHERE [TN].[PK] IS NULL -- PK stands for the primary key column
-- (or the first column of a multi-column primary key)
Please note, however, that in a multithreaded environment such a query might fail. For more information, read this SO post and Dan Guzman's blog post it links to.
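The anti-join can also be written with NOT EXISTS, which reads closer to the requirement ("insert only if this TaskId/EmpKey pair is not there yet"). Below is a minimal runnable sketch in SQLite via Python's sqlite3 module, not T-SQL. The Incoming table is a hypothetical stand-in for the table-valued parameter already joined to Employee, and 'T1' stands in for the GUID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TaskNotification (
    TaskNotificationId INTEGER PRIMARY KEY,
    TaskId TEXT NOT NULL,
    EmpKey INTEGER NOT NULL
);
-- Hypothetical stand-in for the incoming (TaskId, EmpKey) rows.
CREATE TABLE Incoming (TaskId TEXT NOT NULL, EmpKey INTEGER NOT NULL);

INSERT INTO TaskNotification (TaskId, EmpKey) VALUES ('T1', 44), ('T1', 49);
INSERT INTO Incoming VALUES ('T1', 44), ('T1', 49), ('T1', 54);

-- Insert only the pairs that are not already present in the target table.
INSERT INTO TaskNotification (TaskId, EmpKey)
SELECT i.TaskId, i.EmpKey
FROM Incoming AS i
WHERE NOT EXISTS (
    SELECT 1
    FROM TaskNotification AS tn
    WHERE tn.TaskId = i.TaskId
      AND tn.EmpKey = i.EmpKey
);
""")

keys = [
    row[0]
    for row in conn.execute(
        "SELECT EmpKey FROM TaskNotification WHERE TaskId = 'T1' ORDER BY EmpKey"
    )
]
print(keys)
```

The same concurrency caveat applies: without a unique constraint on (TaskId, EmpKey), two sessions running this at the same time can still both insert the same pair.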
A brief explanation on the relevant domain part:
A Category is composed of four data:
Gender (Male/Female)
Age Division (Mighty Mite to Master)
Belt Color (White to Black)
Weight Division (Rooster to Heavy)
So, Male Adult Black Rooster forms one category. Some combinations may not exist, such as mighty mite black belt.
An Athlete fights Athletes of the same Category, and if he classifies, he fights Athletes of different Weight Divisions (but of the same Gender, Age and Belt).
To the modeling. I have a Category table, already populated with all combinations that exists in the domain.
CREATE TABLE Category (
[Id] [int] IDENTITY(1,1) NOT NULL,
[AgeDivision_Id] [int] NULL,
[Gender] [int] NULL,
[BeltColor] [int] NULL,
[WeightDivision] [int] NULL
)
A CategorySet and a CategorySet_Category, which forms a many to many relationship with Category.
CREATE TABLE CategorySet (
[Id] [int] IDENTITY(1,1) NOT NULL,
[Championship_Id] [int] NOT NULL,
)
CREATE TABLE CategorySet_Category (
[CategorySet_Id] [int] NOT NULL,
[Category_Id] [int] NOT NULL
)
Given the following result set:
| Options_Id | Championship_Id | AgeDivision_Id | BeltColor | Gender | WeightDivision |
|------------|-----------------|----------------|-----------|--------|----------------|
| 2963 | 422 | 15 | 7 | 0 | 0 |
| 2963 | 422 | 15 | 7 | 0 | 1 |
| 2963 | 422 | 15 | 7 | 0 | 2 |
| 2963 | 422 | 15 | 7 | 0 | 3 |
| 2964 | 422 | 15 | 8 | 0 | 0 |
| 2964 | 422 | 15 | 8 | 0 | 1 |
| 2964 | 422 | 15 | 8 | 0 | 2 |
| 2964 | 422 | 15 | 8 | 0 | 3 |
Because athletes may fight two CategorySets, I need CategorySet and CategorySet_Category to be populated in two different ways (it can be two queries):
One Category_Set for each row, with one CategorySet_Category pointing to the corresponding Category.
One Category_Set that groups all WeightDivisions in one CategorySet in the same AgeDivision_Id, BeltColor, Gender. In this example, only BeltColor varies.
So the final result would have a total of 10 CategorySet rows:
| Id | Championship_Id |
|----|-----------------|
| 1 | 422 |
| 2 | 422 |
| 3 | 422 |
| 4 | 422 |
| 5 | 422 |
| 6 | 422 |
| 7 | 422 |
| 8 | 422 |
| 9 | 422 | /* groups different Weight Division for BeltColor 7 */
| 10 | 422 | /* groups different Weight Division for BeltColor 8 */
And CategorySet_Category would have 16 rows:
| CategorySet_Id | Category_Id |
|----------------|-------------|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| 9 | 1 | /* groups different Weight Division for BeltColor 7 */
| 9 | 2 | /* groups different Weight Division for BeltColor 7 */
| 9 | 3 | /* groups different Weight Division for BeltColor 7 */
| 9 | 4 | /* groups different Weight Division for BeltColor 7 */
| 10 | 5 | /* groups different Weight Division for BeltColor 8 */
| 10 | 6 | /* groups different Weight Division for BeltColor 8 */
| 10 | 7 | /* groups different Weight Division for BeltColor 8 */
| 10 | 8 | /* groups different Weight Division for BeltColor 8 */
I have no idea how to insert into CategorySet, grab its generated Id, and then use it to insert into CategorySet_Category.
I hope I've made my intentions clear.
I've also created a SQLFiddle.
Edit 1: I commented in Jacek's answer that this would run only once, but this is false. It will run a couple of times a week. I have the option to run as SQL Command from C# or a stored procedure. Performance is not crucial.
Edit 2: Jacek suggested using SCOPE_IDENTITY to return the Id. Problem is, SCOPE_IDENTITY returns only the last inserted Id, and I insert more than one row in CategorySet.
Edit 3: Answer to @FutbolFan, who asked how the FakeResultSet is retrieved.
It is a table CategoriesOption (Id, Price_Id, MaxAthletesByTeam)
And tables CategoriesOptionBeltColor, CategoriesOptionAgeDivision, CategoriesOptionWeightDivision, CategoriesOptionGender. Those four tables are basically the same (Id, CategoriesOption_Id, Value).
The query looks like this:
SELECT * FROM CategoriesOption co
LEFT JOIN CategoriesOptionAgeDivision ON
CategoriesOptionAgeDivision.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionBeltColor ON
CategoriesOptionBeltColor.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionGender ON
CategoriesOptionGender.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionWeightDivision ON
CategoriesOptionWeightDivision.CategoriesOption_Id = co.Id
The solution described here will work correctly in a multi-user environment and when the destination tables CategorySet and CategorySet_Category are not empty.
I used the schema and sample data from your SQL Fiddle.
First part
is straightforward: (ab)use MERGE with the OUTPUT clause.
MERGE can INSERT, UPDATE and DELETE rows. In our case we only need to INSERT. 1=0 is always false, so the NOT MATCHED BY TARGET part is always executed. In general, there could be other branches, see docs. WHEN MATCHED is usually used to UPDATE; WHEN NOT MATCHED BY SOURCE is usually used to DELETE, but we don't need them here.
This convoluted form of MERGE is equivalent to a simple INSERT, but unlike a simple INSERT its OUTPUT clause can refer to the source columns that we need.
MERGE INTO CategorySet
USING
(
SELECT
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,Category.Id AS Category_Id
FROM
FakeResultSet
INNER JOIN Category ON
Category.AgeDivision_Id = FakeResultSet.AgeDivision_Id AND
Category.Gender = FakeResultSet.Gender AND
Category.BeltColor = FakeResultSet.BeltColor AND
Category.WeightDivision = FakeResultSet.WeightDivision
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(Championship_Id
,Price_Id
,MaxAthletesByTeam)
VALUES
(Src.Championship_Id
,Src.Price_Id
,Src.MaxAthletesByTeam)
OUTPUT inserted.id AS CategorySet_Id, Src.Category_Id
INTO CategorySet_Category (CategorySet_Id, Category_Id)
;
FakeResultSet is joined with Category to get Category.id for each row of FakeResultSet. It is assumed that Category has unique combinations of AgeDivision_Id, Gender, BeltColor, WeightDivision.
In the OUTPUT clause we need columns from both the source and the destination tables. The OUTPUT clause of a simple INSERT statement doesn't provide them, so we use MERGE here, which does.
The MERGE query above would insert 8 rows into CategorySet and insert 8 rows into CategorySet_Category using the generated IDs.
Second part
needs temporary storage. I'll use a table variable to store the generated IDs.
DECLARE @T TABLE (
CategorySet_Id int
,AgeDivision_Id int
,Gender int
,BeltColor int);
We need to remember the generated CategorySet_Id together with the combination of AgeDivision_Id, Gender, BeltColor that caused it.
MERGE INTO CategorySet
USING
(
SELECT
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,FakeResultSet.AgeDivision_Id
,FakeResultSet.Gender
,FakeResultSet.BeltColor
FROM
FakeResultSet
GROUP BY
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,FakeResultSet.AgeDivision_Id
,FakeResultSet.Gender
,FakeResultSet.BeltColor
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(Championship_Id
,Price_Id
,MaxAthletesByTeam)
VALUES
(Src.Championship_Id
,Src.Price_Id
,Src.MaxAthletesByTeam)
OUTPUT
inserted.id AS CategorySet_Id
,Src.AgeDivision_Id
,Src.Gender
,Src.BeltColor
INTO @T(CategorySet_Id, AgeDivision_Id, Gender, BeltColor)
;
The MERGE above would group FakeResultSet as needed and insert 2 rows into CategorySet and 2 rows into @T.
Then join @T with Category to get Category.IDs:
INSERT INTO CategorySet_Category (CategorySet_Id, Category_Id)
SELECT
TT.CategorySet_Id
,Category.Id AS Category_Id
FROM
@T AS TT
INNER JOIN Category ON
Category.AgeDivision_Id = TT.AgeDivision_Id AND
Category.Gender = TT.Gender AND
Category.BeltColor = TT.BeltColor
;
This will insert 8 rows into CategorySet_Category.
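If the work is driven from application code instead of a single T-SQL batch, the same "insert a parent row, capture its generated Id, insert the child rows" pattern can be sketched row by row. The example below uses SQLite through Python's sqlite3 module, where cursor.lastrowid plays the role that OUTPUT inserted.id plays in the MERGE. It is a different technique from the set-based MERGE above, the helper new_set is made up for illustration, and the Price_Id and MaxAthletesByTeam columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (Id INTEGER PRIMARY KEY, AgeDivision_Id INT,
                       BeltColor INT, Gender INT, WeightDivision INT);
CREATE TABLE CategorySet (Id INTEGER PRIMARY KEY, Championship_Id INT NOT NULL);
CREATE TABLE CategorySet_Category (CategorySet_Id INT NOT NULL,
                                   Category_Id INT NOT NULL);
CREATE TABLE FakeResultSet (Championship_Id INT, AgeDivision_Id INT,
                            BeltColor INT, Gender INT, WeightDivision INT);
""")

# Sample data from the question: belts 7 and 8, weight divisions 0..3.
combos = [(15, belt, 0, weight) for belt in (7, 8) for weight in range(4)]
conn.executemany(
    "INSERT INTO Category (AgeDivision_Id, BeltColor, Gender, WeightDivision) "
    "VALUES (?, ?, ?, ?)", combos)
conn.executemany("INSERT INTO FakeResultSet VALUES (422, ?, ?, ?, ?)", combos)


def new_set(championship_id, category_ids):
    """Insert one CategorySet and link it to the given categories."""
    cur = conn.execute(
        "INSERT INTO CategorySet (Championship_Id) VALUES (?)",
        (championship_id,))
    conn.executemany(
        "INSERT INTO CategorySet_Category VALUES (?, ?)",
        [(cur.lastrowid, category_id) for category_id in category_ids])


with conn:  # one transaction for both parts
    # Part 1: one CategorySet per source row, linked to its single Category.
    per_row = conn.execute("""
        SELECT f.Championship_Id, c.Id
        FROM FakeResultSet f
        JOIN Category c ON c.AgeDivision_Id = f.AgeDivision_Id
                       AND c.Gender = f.Gender
                       AND c.BeltColor = f.BeltColor
                       AND c.WeightDivision = f.WeightDivision
        ORDER BY c.Id""").fetchall()
    for championship_id, category_id in per_row:
        new_set(championship_id, [category_id])

    # Part 2: one CategorySet per (AgeDivision_Id, Gender, BeltColor) group.
    groups = conn.execute("""
        SELECT DISTINCT Championship_Id, AgeDivision_Id, Gender, BeltColor
        FROM FakeResultSet
        ORDER BY BeltColor""").fetchall()
    for championship_id, age, gender, belt in groups:
        category_ids = [row[0] for row in conn.execute(
            "SELECT Id FROM Category WHERE AgeDivision_Id = ? AND Gender = ? "
            "AND BeltColor = ? ORDER BY Id", (age, gender, belt))]
        new_set(championship_id, category_ids)

set_count = conn.execute("SELECT COUNT(*) FROM CategorySet").fetchone()[0]
link_count = conn.execute("SELECT COUNT(*) FROM CategorySet_Category").fetchone()[0]
print(set_count, link_count)
```

This trades the single set-based statement for one round trip per CategorySet, which should be acceptable given that performance is not crucial here.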
This is not a full answer, but a direction you can use to solve this:
1st query:
select row_number() over(order by t, Id) as n, Championship_Id
from (
select distinct 0 as t, b.Id, a.Championship_Id
from FakeResultSet as a
inner join
Category as b
on
a.AgeDivision_Id=b.AgeDivision_Id and
a.Gender=b.Gender and
a.BeltColor=b.BeltColor and
a.WeightDivision=b.WeightDivision
union all
select distinct 1, BeltColor, Championship_Id
from FakeResultSet
) as q
2nd query:
select q2.CategorySet_Id, c.Id as Category_Id from (
select row_number() over(order by t, Id) as CategorySet_Id, Id, BeltColor
from (
select distinct 0 as t, b.Id, null as BeltColor
from FakeResultSet as a
inner join
Category as b
on
a.AgeDivision_Id=b.AgeDivision_Id and
a.Gender=b.Gender and
a.BeltColor=b.BeltColor and
a.WeightDivision=b.WeightDivision
union all
select distinct 1, BeltColor, BeltColor
from FakeResultSet
) as q
) as q2
inner join
Category as c
on
(q2.BeltColor is null and q2.Id=c.Id)
OR
(q2.BeltColor = c.BeltColor)
Of course, this will work only for empty CategorySet and CategorySet_Category tables, but you can use select coalesce(max(Id), 0) from CategorySet to get the current maximum and add it to row_number. That gives you the real Id that will be inserted into the CategorySet row, which you can then use in the second query.
What I do when I run into these situations is to create one or many temporary tables with row_number() over clauses giving me identities on the temporary tables. Then I check for the existence of each record in the actual tables, and if they exist update the temporary table with the actual record ids. Finally I run a while exists loop on the temporary table records missing the actual id and insert them one at a time, after the insert I update the temporary table record with the actual ids. This lets you work through all the data in a controlled manner.
@@IDENTITY is your friend for the 2nd part of the question
https://msdn.microsoft.com/en-us/library/ms187342.aspx
and
Best way to get identity of inserted row?
Some APIs (drivers) return an int from the update() function, i.e. the ID if the statement is an insert. What API/environment do you use?
I don't understand the 1st problem. You should not insert into an identity column.
The query below will give the final result for the CategorySet rows:
SELECT
ROW_NUMBER () OVER (PARTITION BY Championship_Id ORDER BY Championship_Id) RNK,
Championship_Id
FROM
(
SELECT
Championship_Id
,BeltColor
FROM #FakeResultSet
UNION ALL
SELECT
Championship_Id,BeltColor
FROM #FakeResultSet
GROUP BY Championship_Id,BeltColor
)BASE
I have been searching around for how to do this for days - unfortunately I don't have much experience with SQL Queries, so it's been a bit of trial and error.
Basically, I have created two tables - both with one DateTime column and a different column with values in.
The DateTime column has different values in each table.
So...
ACOQ1 (Table 1)
===============
| DateTime | ACOQ1_Pump_Running |
|----------+--------------------|
| 7:14:12 | 1 |
| 8:09:03 | 1 |
ACOQ2 (Table 2)
===============
| DateTime | ACOQ2_Pump_Running |
|----------+--------------------|
| 3:54:20 | 1 |
| 7:32:57 | 1 |
I want to combine these two tables to look like this:
| DateTime | ACOQ1_Pump_Running | ACOQ2_Pump_Running |
|----------+--------------------+--------------------|
| 3:54:20 | 0 OR NULL | 1 |
| 7:14:12 | 1 | 0 OR NULL |
| 7:32:57 | 0 OR NULL | 1 |
| 8:09:03 | 1 | 0 OR NULL |
I have achieved this by creating a third table that 'UNION's the DateTime column from both tables and then uses that third table's DateTime column for the new table but was wondering if there was a way to skip this step out.
(Eventually I will be adding more and more columns on from different tables and don't really want to be adding yet more processing time by creating a joint DateTime table that may not be necessary).
My working code at the moment:
CREATE TABLE JointDateTime
(
DateTime CHAR(50)
CONSTRAINT [pk_Key3] PRIMARY KEY (DateTime)
);
INSERT INTO JointDateTime (DateTime)
SELECT ACOQ1.DateTime FROM ACOQ1
UNION
SELECT ACOQ2.DateTime FROM ACOQ2
SELECT JointDateTime.DateTime, ACOQ1.ACOQ1_NO_1_PUMP_RUNNING, ACOQ2.ACOQ2_NO_1_PUMP_RUNNING
FROM (SELECT ACOQ1.DateTime FROM ACOQ1
UNION
SELECT ACOQ2.DateTime FROM ACOQ2) JointDateTime
LEFT OUTER JOIN ACOQ1
ON JointDateTime.DateTime = ACOQ1.DateTime
LEFT OUTER JOIN ACOQ2
ON JointDateTime.DateTime = ACOQ2.DateTime
You need a plain old FULL OUTER JOIN like this.
SELECT COALESCE(A1.DateTime,A2.DateTime) DateTime,ACOQ1_Pump_Running, ACOQ2_Pump_Running
FROM ACOQ1 A1
FULL OUTER JOIN ACOQ2 A2
ON A1.DateTime = A2.DateTime
This will give you NULL for ACOQ1_Pump_Running or ACOQ2_Pump_Running in rows whose DateTime has no match in the corresponding table. If you need 0, just use COALESCE or ISNULL.
Side note: in your script, I can see you are using DateTime CHAR(50). Please use an appropriate date/time type.
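To see the shape of the result, including the COALESCE to 0, here is a small runnable sketch using SQLite through Python's sqlite3 module. SQLite versions before 3.39 have no FULL OUTER JOIN, so the sketch emulates it with a LEFT JOIN plus a UNION ALL of the rows that exist only in ACOQ2; on SQL Server you would simply use the FULL OUTER JOIN shown above. The zero-padded time strings are an assumption made so that text ordering matches time ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACOQ1 (DateTime TEXT PRIMARY KEY, ACOQ1_Pump_Running INT);
CREATE TABLE ACOQ2 (DateTime TEXT PRIMARY KEY, ACOQ2_Pump_Running INT);
INSERT INTO ACOQ1 VALUES ('07:14:12', 1), ('08:09:03', 1);
INSERT INTO ACOQ2 VALUES ('03:54:20', 1), ('07:32:57', 1);
""")

rows = conn.execute("""
    -- Every ACOQ1 row, with the ACOQ2 value when the timestamps match.
    SELECT a1.DateTime,
           a1.ACOQ1_Pump_Running,
           COALESCE(a2.ACOQ2_Pump_Running, 0)
    FROM ACOQ1 a1
    LEFT JOIN ACOQ2 a2 ON a2.DateTime = a1.DateTime
    UNION ALL
    -- Plus the ACOQ2 rows that have no partner in ACOQ1.
    SELECT a2.DateTime,
           0,
           a2.ACOQ2_Pump_Running
    FROM ACOQ2 a2
    LEFT JOIN ACOQ1 a1 ON a1.DateTime = a2.DateTime
    WHERE a1.DateTime IS NULL
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

With more than two tables the emulation grows quickly, which is another reason to prefer a real FULL OUTER JOIN where the database supports it.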