How to combine two SQL queries in MySQL with different columns without combining their resulting rows - sql

Context
I'm trying to create a "feed" system on my website where users can go to their feed and see all their new notifications across the different things they can do on the website. For example, in the "feed" section, users are able to see if the users they follow have created articles and if the users have commented on articles. My current feed system simply uses two separate queries to obtain this information. However, I want to combine these two queries into one so that the user can view the activity of those they follow chronologically. The way my system works now, I get five articles from each person the user follows and put it in the "feed" section and then get five article comments and post it in the same area in the "feed" section. Instead of the queries being separate, I want to combine them so that, instead of seeing five article posts in a row and then five article comments in a row, they will see the feed posts that happened in chronological order, whether the other users created an article first, then commented, then created another article, or whatever the order is, instead of always seeing the same order of notifications.
Question
First, let me show you my code for table creation if you would like to recreate this. The first thing to do is to create a users table, which my articles and articlecomments tables reference:
CREATE TABLE users (
idUsers int(11) AUTO_INCREMENT PRIMARY KEY NOT NULL,
uidUsers TINYTEXT NOT NULL,
emailUsers VARCHAR(100) NOT NULL,
pwdUsers LONGTEXT NOT NULL,
created DATETIME NOT NULL,
UNIQUE (emailUsers),
FULLTEXT(uidUsers)
) ENGINE=InnoDB;
Next, let's create the articles table:
CREATE TABLE articles (
article_id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
title TEXT NOT NULL,
article TEXT NOT NULL,
date DATETIME NOT NULL,
idUsers int(11) NOT NULL,
topic VARCHAR(50) NOT NULL,
published VARCHAR(50) NOT NULL,
PRIMARY KEY (article_id),
FULLTEXT(title, article),
FOREIGN KEY (idUsers) REFERENCES users (idUsers) ON DELETE CASCADE ON UPDATE
CASCADE
) ENGINE=InnoDB;
Finally, we need the articlecomments table:
CREATE TABLE articlecomments (
comment_id INT(11) AUTO_INCREMENT PRIMARY KEY NOT NULL,
message TEXT NOT NULL,
date DATETIME NOT NULL,
article_id INT(11) UNSIGNED NOT NULL,
idUsers INT(11) NOT NULL,
seen TINYTEXT NOT NULL,
FOREIGN KEY (article_id) REFERENCES articles (article_id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (idUsers) REFERENCES users (idUsers) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;
To populate the tables sufficiently for this example, we will use these statements:
INSERT INTO users (uidUsers, emailUsers, pwdUsers, created) VALUES ('genericUser', 'genericUser@hotmail.com', 'password', NOW());
INSERT INTO articles (title, article, date, idUsers, topic, published) VALUES ('first article', 'first article contents', NOW(), '1', 'other', 'yes');
INSERT INTO articles (title, article, date, idUsers, topic, published) VALUES ('second article', 'second article contents', NOW(), '1', 'other', 'yes');
INSERT INTO articles (title, article, date, idUsers, topic, published) VALUES ('third article', 'third article contents', NOW(), '1', 'other', 'yes');
INSERT INTO articlecomments (message, date, article_id, idUsers, seen) VALUES ('first message', NOW(), '1', '1', 'false');
INSERT INTO articlecomments (message, date, article_id, idUsers, seen) VALUES ('second message', NOW(), '1', '1', 'false');
INSERT INTO articlecomments (message, date, article_id, idUsers, seen) VALUES ('third message', NOW(), '1', '1', 'false');
The two queries that I'm using to obtain data from the articles and articlecomments tables are below:
Query 1:
SELECT
articles.article_id, articles.title, articles.date,
articles.idUsers, users.uidUsers
FROM articles
JOIN users ON articles.idUsers = users.idUsers
WHERE articles.idUsers = '1' AND articles.published = 'yes'
ORDER BY articles.date DESC
LIMIT 5
Query 2:
SELECT
articlecomments.comment_id, articlecomments.message,
articlecomments.date, articlecomments.article_id, users.uidUsers
FROM articlecomments
JOIN users ON articlecomments.idUsers = users.idUsers
WHERE articlecomments.idUsers = '1'
ORDER BY articlecomments.date DESC
LIMIT 5
How would I combine these two queries that contain different information and columns so that they are ordered based on the date of creation (articles.date and articlecomments.date, respectively)? I want them to be in separate rows, not the same row. So, it should be like I queried them separately and simply combined the resulting rows together. If there are three articles and three article comments, I want there to be six total returned rows.
Here's what I want this to look like. Given there are three articles and three article comments, and the comments were created after the articles, this is what the result should look like after combining the queries above (I'm not sure if this portrayal is possible given the different column names but I'm wondering if something similar could be accomplished):
+-------------------------------+-------------------+---------------------+----------------------------------------------------------------+---------+-------------+
| id (article_id or comment_id) | title/message | date | article_id (because it is referenced in articlecomments table) | idUsers | uidUsers |
+-------------------------------+-------------------+---------------------+----------------------------------------------------------------+---------+-------------+
| 1 | first message | 2020-07-07 11:27:15 | 1 | 1 | genericUser |
| 2 | second message | 2020-07-07 11:27:15 | 1 | 1 | genericUser |
| 3 | third message | 2020-07-07 11:27:15 | 1 | 1 | genericUser |
| 2 | second article | 2020-07-07 10:47:35 | 2 | 1 | genericUser |
| 3 | third article | 2020-07-07 10:47:35 | 3 | 1 | genericUser |
| 1 | first article | 2020-07-07 10:46:51 | 1 | 1 | genericUser |
+-------------------------------+-------------------+---------------------+----------------------------------------------------------------+---------+-------------+
Things I have Tried
I have read that this might involve JOIN or UNION operators, but I'm unsure of how to implement them in this situation. I did try combining the two queries by simply using (Query 1) UNION (Query 2), which at first told me that the number of columns were different in my two queries, so I had to remove the idUsers column from my articlecomments query. This actually got me kind of close, but it wasn't formatted correctly:
+------------+-------------------+---------------------+---------+-------------+
| article_id | title | date | idUsers | uidUsers |
+------------+-------------------+---------------------+---------+-------------+
| 2 | first message | 2020-07-07 10:47:35 | 1 | genericUser |
| 3 | third article | 2020-07-07 10:47:35 | 1 | genericUser |
| 1 | first article | 2020-07-07 10:46:51 | 1 | genericUser |
| 1 | second article | 2020-07-07 11:27:15 | 1 | genericUser |
| 2 | third article | 2020-07-07 11:27:15 | 1 | genericUser |
| 3 | first article | 2020-07-07 11:27:15 | 1 | genericUser |
+------------+-------------------+---------------------+---------+-------------+
Any ideas? Let me know if there is any confusion. Thanks.
Server type: MariaDB
Server version: 10.4.8-MariaDB - mariadb.org binary distribution

This seems like MySQL. You could do something like this:
select *
from (
SELECT
articles.article_id as id_article_comment, articles.title as title_message,
articles.date as created, 'article' AS contenttype,
articles.article_id as article_id, articles.idUsers, users.uidUsers
FROM articles
JOIN users ON articles.idUsers = users.idUsers
WHERE articles.idUsers = '1' AND articles.published = 'yes'
ORDER BY articles.date DESC
LIMIT 5
) a
union all
select *
from (
SELECT
articlecomments.comment_id, articlecomments.message,
articlecomments.date, 'article comment' AS contenttype,
articlecomments.article_id, articlecomments.idUsers, users.uidUsers
FROM articlecomments
JOIN users ON articlecomments.idUsers = users.idUsers
WHERE articlecomments.idUsers = '1'
ORDER BY articlecomments.date DESC
LIMIT 5
) b
order by created DESC
See example here: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=26280a9c1c5f62fc33d00d93ab84adf3
Result like this:
id_article_comment | title_message | created | article_id | uidUsers
-----------------: | :------------- | :------------------ | ---------: | :----------
1 | first article | 2020-07-09 05:59:18 | 1 | genericUser
2 | second article | 2020-07-09 05:59:18 | 1 | genericUser
3 | third article | 2020-07-09 05:59:18 | 1 | genericUser
1 | first message | 2020-07-09 05:59:18 | 1 | genericUser
2 | second message | 2020-07-09 05:59:18 | 1 | genericUser
3 | third message | 2020-07-09 05:59:18 | 1 | genericUser
Explanation
Since we want to use order by and limit, we'll create a subquery out of the first line and select all columns from that first subquery. We'll name each field the way we want in the output.
We do the same thing with the 2nd query and add a union all clause between them. Then, we apply ordering based on created date (which was an alias in the first query) to get the results you desired in the order you desired.
If you use union, duplicate rows will be eliminated from the result. If you use union all, duplicate rows, if they exist, will be retained. union all is faster because it only concatenates the two datasets (as long as the columns are the same in both queries). union additionally has to look for duplicate rows and remove them from the result.
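The approach above can be tried out in isolation with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MariaDB, a trimmed-down version of the tables from the question, and made-up timestamps that are deliberately distinct so the interleaving is visible. The short aliases (a, c, u) and the outer wrapping select are choices made for this sketch, not part of the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (idUsers INTEGER PRIMARY KEY, uidUsers TEXT NOT NULL);
CREATE TABLE articles (
    article_id INTEGER PRIMARY KEY, title TEXT, date TEXT,
    idUsers INTEGER, published TEXT);
CREATE TABLE articlecomments (
    comment_id INTEGER PRIMARY KEY, message TEXT, date TEXT,
    article_id INTEGER, idUsers INTEGER);
INSERT INTO users VALUES (1, 'genericUser');
-- Timestamps are invented and distinct so the merged order is unambiguous.
INSERT INTO articles VALUES
    (1, 'first article',  '2020-07-07 10:00:00', 1, 'yes'),
    (2, 'second article', '2020-07-07 12:00:00', 1, 'yes');
INSERT INTO articlecomments VALUES
    (1, 'first message',  '2020-07-07 11:00:00', 1, 1),
    (2, 'second message', '2020-07-07 13:00:00', 1, 1);
""")

# Each branch applies its own ORDER BY / LIMIT inside a subquery;
# the outer query sorts the combined rows by the shared "created" alias.
feed = conn.execute("""
SELECT * FROM (
    SELECT * FROM (
        SELECT a.article_id AS id, a.title AS title_message, a.date AS created,
               'article' AS contenttype, a.article_id AS article_id,
               u.uidUsers AS uidUsers
        FROM articles a JOIN users u ON a.idUsers = u.idUsers
        WHERE a.idUsers = 1 AND a.published = 'yes'
        ORDER BY a.date DESC LIMIT 5)
    UNION ALL
    SELECT * FROM (
        SELECT c.comment_id, c.message, c.date,
               'article comment', c.article_id, u.uidUsers
        FROM articlecomments c JOIN users u ON c.idUsers = u.idUsers
        WHERE c.idUsers = 1
        ORDER BY c.date DESC LIMIT 5)
)
ORDER BY created DESC
""").fetchall()

titles = [row[1] for row in feed]
for row in feed:
    print(row)
```

The contenttype column lets the page decide how to render each row, since article ids and comment ids share the first column.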

You don't mention the version of MySQL you are using, so I'll assume it's a modern one (MySQL 8.x). You can produce a row number on each subset using ROW_NUMBER() and then a plain UNION ALL will do the trick.
I don't fully understand the exact order you want, or what the fourth column, article_id (because it is referenced in articlecomments table), means. If you elaborate, I can tweak this answer accordingly.
The query that produces the result set you want is:
select *
from ( (
SELECT
a.article_id as id, a.title, a.date,
a.article_id, u.uidUsers,
row_number() over(ORDER BY a.date DESC) as rn
FROM articles a
JOIN users u ON a.idUsers = u.idUsers
WHERE a.idUsers = '1' AND a.published = 'yes'
ORDER BY a.date DESC
LIMIT 5
) union all (
SELECT
c.comment_id, c.message, c.date,
c.article_id, u.uidUsers,
5 + row_number() over(ORDER BY c.date DESC) as rn
FROM articlecomments c
JOIN users u ON c.idUsers = u.idUsers
WHERE c.idUsers = '1'
ORDER BY c.date DESC
LIMIT 5
)
) x
order by rn
Result:
id title date article_id uidUsers rn
-- -------------- ------------------- ---------- ----------- --
1 first article 2020-07-10 10:37:00 1 genericUser 1
2 second article 2020-07-10 10:37:00 2 genericUser 2
3 third article 2020-07-10 10:37:00 3 genericUser 3
1 first message 2020-07-10 10:37:00 1 genericUser 6
2 second message 2020-07-10 10:37:00 1 genericUser 7
3 third message 2020-07-10 10:37:00 1 genericUser 8
See running example in db<>fiddle.

You can cross join like this:
SELECT *
FROM (
SELECT 1 AS job_match
FROM [job] WITH (NOLOCK)
WHERE MemberCode = 'pay'
AND CampaignID = '2'
) j
CROSS JOIN (
SELECT 1 AS product_match
FROM [product] WITH (NOLOCK)
WHERE MemberCode = 'pay'
AND CampaignID = '2'
) p

Related

Find first available value that doesn't exist

I want to create a table for book chapters where the primary key is (book_id, chapter_internal_number). I'm not sure how to find the next free chapter_internal_number value when inserting a new chapter (a chapter can be deleted, and its chapter_internal_number value should then be reused).
How do I find the first available chapter_internal_number value for a book? The available value is the lowest value, in ascending order, that doesn't exist yet.
Table book_chapter:
| pk | pk |
| book_id | chapter_internal_number |
| 1 | 1 |
| 1 | 2 |
| 1 | 5 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
Expected:
for book_id=1 is 3
for book_id=2 is 4
Basically, you want the first gap in the chapter numbers for each book. I don't think that you need generate_series() for this; you can just compare the current chapter to the next, using lead():
select book_id, min(chapter_internal_number) + 1
from (
select bc.*,
lead(chapter_internal_number) over(partition by book_id order by chapter_internal_number) lead_chapter_internal_number
from book_chapter bc
) bc
where lead_chapter_internal_number is distinct from chapter_internal_number + 1
group by book_id
This seems to be the most natural way to phrase your query, and I suspect that it should be more efficient than enumerating all possible values with generate_series() (I would be interested to know how both solutions comparatively perform against a large dataset).
We could also use distinct on rather than aggregation in the outer query:
select distinct on (book_id) book_id, chapter_internal_number + 1
from (
select bc.*,
lead(chapter_internal_number) over(partition by book_id order by chapter_internal_number) lead_chapter_internal_number
from book_chapter bc
) bc
where lead_chapter_internal_number is distinct from chapter_internal_number + 1
order by book_id, chapter_internal_number
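To check the lead() approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module instead of Postgres. It assumes an SQLite build with window-function support (3.25 or later) and writes the null-safe comparison as IS NOT, which plays the role of Postgres's is distinct from here. The column alias next_free is made up for this sketch.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book_chapter (
    book_id INTEGER NOT NULL,
    chapter_internal_number INTEGER NOT NULL,
    PRIMARY KEY (book_id, chapter_internal_number));
INSERT INTO book_chapter VALUES (1,1),(1,2),(1,5),(2,1),(2,2),(2,3);
""")

# For every chapter, look at the next chapter number in the same book.
# A row "starts a gap" when the next number is missing or not current + 1;
# the smallest such row per book, plus one, is the first free number.
rows = conn.execute("""
SELECT book_id, MIN(chapter_internal_number) + 1 AS next_free
FROM (
    SELECT bc.*,
           LEAD(chapter_internal_number) OVER (
               PARTITION BY book_id ORDER BY chapter_internal_number) AS lead_n
    FROM book_chapter bc
) AS t
WHERE lead_n IS NOT (chapter_internal_number + 1)
GROUP BY book_id
ORDER BY book_id
""").fetchall()

next_free = dict(rows)
print(next_free)
```

One thing to keep in mind: the query only looks for gaps after an existing chapter, so a book whose chapter 1 has been deleted would not get 1 back as the free value.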

Can I count the occurrences for postgres array field?

I have a Postgres table that uses the array data type. It allows some magic that makes it possible to avoid having more tables, but its non-standard nature makes it harder for a beginner to work with.
I would like to get some summary data out of it.
Sample content:
CREATE TABLE public.cts (
id serial NOT NULL,
day timestamp NULL,
ct varchar[] NULL,
CONSTRAINT ctrlcts_pkey PRIMARY KEY (id)
);
INSERT INTO public.cts
(id, day, ct)
VALUES(29, '2015-01-24 00:00:00.000', '{ct286,ct281}');
INSERT INTO public.cts
(id, day, ct)
VALUES(30, '2015-01-25 00:00:00.000', '{ct286,ct281}');
INSERT INTO public.cts
(id, day, ct)
VALUES(31, '2015-01-26 00:00:00.000', '{ct286,ct277,ct281}');
I would like to get the total number of occurrences per array member, with output like this, for example:
name | value
ct286 | 3
ct281 | 3
ct277 | 1
Use the Postgres array function unnest():
SELECT name, COUNT(*) cnt
FROM cts, unnest(ct) as u(name)
GROUP BY name
Demo on DB Fiddle:
| name | cnt |
| ----- | --- |
| ct277 | 1 |
| ct281 | 3 |
| ct286 | 3 |
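If you want to cross-check the numbers outside the database, the same "flatten each array, then count" logic can be sketched in plain Python. This is not a replacement for unnest(), just the equivalent idea applied to the three sample rows; the rows list is typed in by hand from the inserts above.

```python
from collections import Counter

# Sample rows from the question as (id, ct) pairs.
rows = [
    (29, ["ct286", "ct281"]),
    (30, ["ct286", "ct281"]),
    (31, ["ct286", "ct277", "ct281"]),
]

# Flatten every array into individual names (what unnest does),
# then count how often each name appears (what GROUP BY + COUNT does).
counts = Counter(name for _row_id, ct in rows for name in ct)

for name, cnt in sorted(counts.items()):
    print(name, cnt)
```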

SQL index based search

I have a table called Index which has the columns id and value, where id is an auto-increment bigint and value is a varchar with an english word.
I have a table called Search which has relationships to the table Index. For each search you can define which indexes it should search in a table called Article.
The table Article also has relationships to the table Index.
The tables which define the relationships are:
Searches_Indexes with columns id_search and id_index.
Articles_Indexes with columns id_article and id_index.
I would like to find all Articles that contain the same indexes of Search.
For example: I have a Search with indexes laptop and dell, I would like to retrieve all Articles which contain both indexes, not just one.
So far I have this:
SELECT ai.id_article
FROM articles_indexes AS ai
INNER JOIN searches_indexes AS si
ON si.id_index = ai.id_index
WHERE si.id_search = 1
How do I make my SQL only return the Articles with all the Indexes of a Search?
Edit:
Sample Data:
Article:
id | name | description | ...
1 | 'Dell Laptop' | 'New Dell Laptop...' | ...
2 | 'HP Laptop' | 'Unused HP Laptop...' | ...
...
Search:
id | name | id_user | ...
1 | 'Dell Laptop Search' | 5 | ...
Index:
id | value
1 | 'dell'
2 | 'laptop'
3 | 'hp'
4 | 'new'
5 | 'unused'
...
Articles_Indexes:
Article with id 1 (the dell laptop) has the Indexes 'dell', 'laptop', 'new'.
Article with id 2 (the hp laptop) has the Indexes 'laptop', 'hp', 'unused'.
id_article | id_index
1 | 1
1 | 2
1 | 4
...
2 | 2
2 | 3
2 | 5
...
Searches_Indexes:
Search with id 1 only contains 2 Indexes, 'dell' and 'laptop':
id_search | id_index
1 | 1
1 | 2
Required output:
id_article
1
If I understand correctly, you want aggregation and a HAVING clause. Assuming there are no duplicate entries in the indexes tables:
SELECT ai.id_article
FROM articles_indexes ai INNER JOIN
searches_indexes si
ON si.id_index = ai.id_index
WHERE si.id_search = 1
GROUP BY ai.id_article
HAVING COUNT(*) = (SELECT COUNT(*) FROM searches_indexes si2 WHERE si2.id_search = 1);
This counts the number of matches and makes sure it matches the number you are looking for.
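The query can be checked against the sample data from the question with a small sketch like the one below. It uses Python's built-in sqlite3 module as a stand-in for the real database and only recreates the two relationship tables, which is all the query needs.

```python
import sqlite3

# In-memory SQLite database with only the two relationship tables (sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles_indexes (id_article INTEGER, id_index INTEGER);
CREATE TABLE searches_indexes (id_search INTEGER, id_index INTEGER);
-- Article 1: dell, laptop, new.  Article 2: laptop, hp, unused.
INSERT INTO articles_indexes VALUES (1,1),(1,2),(1,4),(2,2),(2,3),(2,5);
-- Search 1: dell, laptop.
INSERT INTO searches_indexes VALUES (1,1),(1,2);
""")

# Keep only articles whose number of matched indexes equals
# the total number of indexes attached to the search.
matches = [row[0] for row in conn.execute("""
SELECT ai.id_article
FROM articles_indexes ai
JOIN searches_indexes si ON si.id_index = ai.id_index
WHERE si.id_search = 1
GROUP BY ai.id_article
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM searches_indexes si2
                   WHERE si2.id_search = 1)
""")]

print(matches)
```

Article 2 matches only 'laptop', so its count is 1 and it is filtered out by the HAVING clause.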
I should add this. If you wanted to look for all searches at the same time, I'd be inclined to write this as:
SELECT si.id_search, ai.id_article
FROM articles_indexes ai INNER JOIN
(SELECT si.*, COUNT(*) OVER (PARTITION BY si.id_search) as cnt
FROM searches_indexes si
) si
ON si.id_index = ai.id_index
GROUP BY si.id_search, ai.id_article, si.cnt
HAVING COUNT(*) = si.cnt;
You can compare arrays. Here is an example:
create table article_index(id_article int, id_index int);
create table search_index(id_search int, id_index int);
insert into article_index
select generate_series(1,2), generate_series(1,10);
insert into search_index
select generate_series(1,2), generate_series(1,4);
select
id_article
from article_index
group by id_article
having array_agg(id_index) @> (select array_agg(id_index) from search_index where id_search = 2);
Learn more about arrays in postgres.

Insert into multiple tables

A brief explanation on the relevant domain part:
A Category is composed of four data:
Gender (Male/Female)
Age Division (Mighty Mite to Master)
Belt Color (White to Black)
Weight Division (Rooster to Heavy)
So, Male Adult Black Rooster forms one category. Some combinations may not exist, such as mighty mite black belt.
An Athlete fights Athletes of the same Category, and if he classifies, he fights Athletes of different Weight Divisions (but of the same Gender, Age and Belt).
To the modeling. I have a Category table, already populated with all combinations that exists in the domain.
CREATE TABLE Category (
[Id] [int] IDENTITY(1,1) NOT NULL,
[AgeDivision_Id] [int] NULL,
[Gender] [int] NULL,
[BeltColor] [int] NULL,
[WeightDivision] [int] NULL
)
A CategorySet and a CategorySet_Category, which forms a many to many relationship with Category.
CREATE TABLE CategorySet (
[Id] [int] IDENTITY(1,1) NOT NULL,
[Championship_Id] [int] NOT NULL,
)
CREATE TABLE CategorySet_Category (
[CategorySet_Id] [int] NOT NULL,
[Category_Id] [int] NOT NULL
)
Given the following result set:
| Options_Id | Championship_Id | AgeDivision_Id | BeltColor | Gender | WeightDivision |
|------------|-----------------|----------------|-----------|--------|----------------|
1. | 2963 | 422 | 15 | 7 | 0 | 0 |
2. | 2963 | 422 | 15 | 7 | 0 | 1 |
3. | 2963 | 422 | 15 | 7 | 0 | 2 |
4. | 2963 | 422 | 15 | 7 | 0 | 3 |
5. | 2964 | 422 | 15 | 8 | 0 | 0 |
6. | 2964 | 422 | 15 | 8 | 0 | 1 |
7. | 2964 | 422 | 15 | 8 | 0 | 2 |
8. | 2964 | 422 | 15 | 8 | 0 | 3 |
Because athletes may fight two CategorySets, I need CategorySet and CategorySet_Category to be populated in two different ways (it can be two queries):
One Category_Set for each row, with one CategorySet_Category pointing to the corresponding Category.
One Category_Set that groups all WeightDivisions in one CategorySet in the same AgeDivision_Id, BeltColor, Gender. In this example, only BeltColor varies.
So the final result would have a total of 10 CategorySet rows:
| Id | Championship_Id |
|----|-----------------|
| 1 | 422 |
| 2 | 422 |
| 3 | 422 |
| 4 | 422 |
| 5 | 422 |
| 6 | 422 |
| 7 | 422 |
| 8 | 422 |
| 9 | 422 | /* groups different Weight Division for BeltColor 7 */
| 10 | 422 | /* groups different Weight Division for BeltColor 8 */
And CategorySet_Category would have 16 rows:
| CategorySet_Id | Category_Id |
|----------------|-------------|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| 9 | 1 | /* groups different Weight Division for BeltColor 7 */
| 9 | 2 | /* groups different Weight Division for BeltColor 7 */
| 9 | 3 | /* groups different Weight Division for BeltColor 7 */
| 9 | 4 | /* groups different Weight Division for BeltColor 7 */
| 10 | 5 | /* groups different Weight Division for BeltColor 8 */
| 10 | 6 | /* groups different Weight Division for BeltColor 8 */
| 10 | 7 | /* groups different Weight Division for BeltColor 8 */
| 10 | 8 | /* groups different Weight Division for BeltColor 8 */
I have no idea how to insert into CategorySet, grab its generated Id, then use it to insert into CategorySet_Category.
I hope I've made my intentions clear.
I've also created a SQLFiddle.
Edit 1: I commented in Jacek's answer that this would run only once, but this is false. It will run a couple of times a week. I have the option to run as SQL Command from C# or a stored procedure. Performance is not crucial.
Edit 2: Jacek suggested using SCOPE_IDENTITY to return the Id. Problem is, SCOPE_IDENTITY returns only the last inserted Id, and I insert more than one row in CategorySet.
Edit 3: Answer to @FutbolFan who asked how the FakeResultSet is retrieved.
It is a table CategoriesOption (Id, Price_Id, MaxAthletesByTeam)
And tables CategoriesOptionBeltColor, CategoriesOptionAgeDivision, CategoriesOptionWeightDivision, CategoriesOptionGender. Those four tables are basically the same (Id, CategoriesOption_Id, Value).
The query looks like this:
SELECT * FROM CategoriesOption co
LEFT JOIN CategoriesOptionAgeDivision ON
CategoriesOptionAgeDivision.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionBeltColor ON
CategoriesOptionBeltColor.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionGender ON
CategoriesOptionGender.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionWeightDivision ON
CategoriesOptionWeightDivision.CategoriesOption_Id = co.Id
The solution described here will work correctly in multi-user environment and when destination tables CategorySet and CategorySet_Category are not empty.
I used schema and sample data from your SQL Fiddle.
First part is straight-forward
(ab)use MERGE with OUTPUT clause.
MERGE can INSERT, UPDATE and DELETE rows. In our case we need only to INSERT. 1=0 is always false, so the NOT MATCHED BY TARGET part is always executed. In general, there could be other branches, see docs. WHEN MATCHED is usually used to UPDATE; WHEN NOT MATCHED BY SOURCE is usually used to DELETE, but we don't need them here.
This convoluted form of MERGE is equivalent to a simple INSERT, but unlike a simple INSERT, its OUTPUT clause allows us to refer to the columns that we need.
MERGE INTO CategorySet
USING
(
SELECT
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,Category.Id AS Category_Id
FROM
FakeResultSet
INNER JOIN Category ON
Category.AgeDivision_Id = FakeResultSet.AgeDivision_Id AND
Category.Gender = FakeResultSet.Gender AND
Category.BeltColor = FakeResultSet.BeltColor AND
Category.WeightDivision = FakeResultSet.WeightDivision
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(Championship_Id
,Price_Id
,MaxAthletesByTeam)
VALUES
(Src.Championship_Id
,Src.Price_Id
,Src.MaxAthletesByTeam)
OUTPUT inserted.id AS CategorySet_Id, Src.Category_Id
INTO CategorySet_Category (CategorySet_Id, Category_Id)
;
FakeResultSet is joined with Category to get Category.id for each row of FakeResultSet. It is assumed that Category has unique combinations of AgeDivision_Id, Gender, BeltColor, WeightDivision.
In OUTPUT clause we need columns from both source and destination tables. The OUTPUT clause in simple INSERT statement doesn't provide them, so we use MERGE here that does.
The MERGE query above would insert 8 rows into CategorySet and insert 8 rows into CategorySet_Category using generated IDs.
Second part
needs temporary table. I'll use a table variable to store generated IDs.
DECLARE @T TABLE (
CategorySet_Id int
,AgeDivision_Id int
,Gender int
,BeltColor int);
We need to remember the generated CategorySet_Id together with the combination of AgeDivision_Id, Gender, BeltColor that caused it.
MERGE INTO CategorySet
USING
(
SELECT
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,FakeResultSet.AgeDivision_Id
,FakeResultSet.Gender
,FakeResultSet.BeltColor
FROM
FakeResultSet
GROUP BY
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,FakeResultSet.AgeDivision_Id
,FakeResultSet.Gender
,FakeResultSet.BeltColor
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(Championship_Id
,Price_Id
,MaxAthletesByTeam)
VALUES
(Src.Championship_Id
,Src.Price_Id
,Src.MaxAthletesByTeam)
OUTPUT
inserted.id AS CategorySet_Id
,Src.AgeDivision_Id
,Src.Gender
,Src.BeltColor
INTO @T(CategorySet_Id, AgeDivision_Id, Gender, BeltColor)
;
The MERGE above would group FakeResultSet as needed and insert 2 rows into CategorySet and 2 rows into @T.
Then join @T with Category to get Category.IDs:
INSERT INTO CategorySet_Category (CategorySet_Id, Category_Id)
SELECT
TT.CategorySet_Id
,Category.Id AS Category_Id
FROM
@T AS TT
INNER JOIN Category ON
Category.AgeDivision_Id = TT.AgeDivision_Id AND
Category.Gender = TT.Gender AND
Category.BeltColor = TT.BeltColor
;
This will insert 8 rows into CategorySet_Category.
This is not a full answer, but a direction you can use to solve this:
1st query:
select row_number() over(order by t, Id) as n, Championship_Id
from (
select distinct 0 as t, b.Id, a.Championship_Id
from FakeResultSet as a
inner join
Category as b
on
a.AgeDivision_Id=b.AgeDivision_Id and
a.Gender=b.Gender and
a.BeltColor=b.BeltColor and
a.WeightDivision=b.WeightDivision
union all
select distinct 1, BeltColor, Championship_Id
from FakeResultSet
) as q
2nd query:
select q2.CategorySet_Id, c.Id as Category_Id from (
select row_number() over(order by t, Id) as CategorySet_Id, Id, BeltColor
from (
select distinct 0 as t, b.Id, null as BeltColor
from FakeResultSet as a
inner join
Category as b
on
a.AgeDivision_Id=b.AgeDivision_Id and
a.Gender=b.Gender and
a.BeltColor=b.BeltColor and
a.WeightDivision=b.WeightDivision
union all
select distinct 1, BeltColor, BeltColor
from FakeResultSet
) as q
) as q2
inner join
Category as c
on
(q2.BeltColor is null and q2.Id=c.Id)
OR
(q2.BeltColor = c.BeltColor)
Of course, this will work only for empty CategorySet and CategorySet_Category tables, but you can use select coalesce(max(Id), 0) from CategorySet to get the current maximum Id and add it to row_number. That gives you the real Id that will be inserted into each CategorySet row, which the second query can then use.
What I do when I run into these situations is to create one or many temporary tables with row_number() over clauses giving me identities on the temporary tables. Then I check for the existence of each record in the actual tables, and if they exist update the temporary table with the actual record ids. Finally I run a while exists loop on the temporary table records missing the actual id and insert them one at a time, after the insert I update the temporary table record with the actual ids. This lets you work through all the data in a controlled manner.
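The core of the one-at-a-time idea (insert a parent row, read back its generated id, then insert the child rows) can be sketched as follows. This is not the MERGE technique from the accepted answer and not T-SQL: it uses Python's built-in sqlite3 module, where cursor.lastrowid plays the role that SCOPE_IDENTITY() would play in SQL Server. The pending list and its contents are hypothetical illustration data.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CategorySet (
    Id INTEGER PRIMARY KEY AUTOINCREMENT,
    Championship_Id INTEGER NOT NULL);
CREATE TABLE CategorySet_Category (
    CategorySet_Id INTEGER NOT NULL,
    Category_Id INTEGER NOT NULL);
""")

# Hypothetical work list: one entry per CategorySet to create,
# as (championship id, category ids that belong to the set).
pending = [
    (422, [1]),       # single-category set
    (422, [2]),       # single-category set
    (422, [1, 2]),    # grouped set covering both categories
]

for championship_id, category_ids in pending:
    # Insert the parent row and read back the id generated for it.
    cur = conn.execute(
        "INSERT INTO CategorySet (Championship_Id) VALUES (?)",
        (championship_id,),
    )
    set_id = cur.lastrowid
    # Use that id for every child row of this set.
    conn.executemany(
        "INSERT INTO CategorySet_Category (CategorySet_Id, Category_Id) VALUES (?, ?)",
        [(set_id, category_id) for category_id in category_ids],
    )
conn.commit()

links = conn.execute(
    "SELECT CategorySet_Id, Category_Id FROM CategorySet_Category ORDER BY 1, 2"
).fetchall()
print(links)
```

Because each parent insert is a single row, reading back "the last generated id" is unambiguous, which is exactly what a multi-row insert cannot offer. The cost is one round trip per row, which is acceptable when performance is not crucial.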
@@IDENTITY is your friend for the 2nd part of the question
https://msdn.microsoft.com/en-us/library/ms187342.aspx
and
Best way to get identity of inserted row?
Some APIs (drivers) return an int from the update() function, i.e. the ID if the statement is an insert. Which API/environment do you use?
I don't understand the 1st problem. You should not insert into an identity column.
The query below will give the final result for the CategorySet rows:
SELECT
ROW_NUMBER () OVER (PARTITION BY Championship_Id ORDER BY Championship_Id) RNK,
Championship_Id
FROM
(
SELECT
Championship_Id
,BeltColor
FROM #FakeResultSet
UNION ALL
SELECT
Championship_Id,BeltColor
FROM #FakeResultSet
GROUP BY Championship_Id,BeltColor
)BASE

Write SQL script to insert data

In a database that contains many tables, I need to write a SQL script that inserts data only if it does not already exist.
Table currency
| id | Code | lastupdate | rate |
+--------+---------+------------+-----------+
| 1 | USD | 05-11-2012 | 2 |
| 2 | EUR | 05-11-2012 | 3 |
Table client
| id | name | createdate | currencyId|
+--------+---------+------------+-----------+
| 4 | tony | 11-24-2010 | 1 |
| 5 | john | 09-14-2010 | 2 |
Table: account
| id | number | createdate | clientId |
+--------+---------+------------+-----------+
| 7 | 1234 | 12-24-2010 | 4 |
| 8 | 5648 | 12-14-2010 | 5 |
I need to insert to:
currency (id=3, Code=JPY, lastupdate=today, rate=4)
client (id=6, name=Joe, createdate=today, currencyId=Currency with Code 'USD')
account (id=9, number=0910, createdate=today, clientId=Client with name 'Joe')
Problem:
script must check if row exists or not before inserting new data
script must allow us to add a foreign key to the new row where this foreign related to a row already found in database (as currencyId in client table)
script must allow us to add the current datetime to the column in the insert statement (such as createdate in client table)
script must allow us to add a foreign key to the new row where this foreign related to a row inserted in the same script (such as clientId in account table)
Note: I tried the following SQL statement but it solved only the first problem
INSERT INTO Client (id, name, createdate, currencyId)
SELECT 6, 'Joe', '05-11-2012', 1
WHERE not exists (SELECT * FROM Client where id=6);
This query runs without any error, but as you can see I wrote createdate and currencyId manually. I need to take the currency id from a select statement with a where clause (I tried to substitute 1 with a select statement, but the query failed).
This is just an example of what I need. In my database, this script has to insert more than 30 rows into more than 10 tables.
Any help would be appreciated.
You wrote
I tried to substitute 1 by select statement but query failed
But I wonder why did it fail? What did you try? This should work:
INSERT INTO Client (id, name, createdate, currencyId)
SELECT
6,
'Joe',
current_date,
(select c.id from currency as c where c.code = 'USD') as currencyId
WHERE not exists (SELECT * FROM Client where id=6);
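Here is a minimal sketch of how that pattern could cover the client and account rows from the question, including a foreign key that points at a row inserted earlier in the same script. It runs against Python's built-in sqlite3 module rather than the asker's database, so date('now') stands in for current_date or GETDATE(), and the trimmed table definitions are assumptions made for the sketch.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE currency (id INTEGER PRIMARY KEY, code TEXT, lastupdate TEXT, rate INTEGER);
CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT, createdate TEXT,
                     currencyId INTEGER REFERENCES currency(id));
CREATE TABLE account (id INTEGER PRIMARY KEY, number TEXT, createdate TEXT,
                      clientId INTEGER REFERENCES client(id));
INSERT INTO currency VALUES (1, 'USD', '2012-05-11', 2), (2, 'EUR', '2012-05-11', 3);
""")

SCRIPT = """
-- Client: foreign key looked up from a row that already exists (currency USD).
INSERT INTO client (id, name, createdate, currencyId)
SELECT 6, 'Joe', date('now'),
       (SELECT c.id FROM currency c WHERE c.code = 'USD')
WHERE NOT EXISTS (SELECT 1 FROM client WHERE id = 6);

-- Account: foreign key looked up from the client inserted just above.
INSERT INTO account (id, number, createdate, clientId)
SELECT 9, '0910', date('now'),
       (SELECT cl.id FROM client cl WHERE cl.name = 'Joe')
WHERE NOT EXISTS (SELECT 1 FROM account WHERE id = 9);
"""

conn.executescript(SCRIPT)
conn.executescript(SCRIPT)  # running it again inserts nothing new

client_rows = conn.execute("SELECT id, name, currencyId FROM client").fetchall()
account_rows = conn.execute("SELECT id, number, clientId FROM account").fetchall()
print(client_rows, account_rows)
```

Because every insert is guarded by its own NOT EXISTS check, the script can be run repeatedly without creating duplicates, and the order of the statements takes care of foreign keys that depend on rows created earlier in the script.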
It looks like you can work out if the data exists.
Here is a quick bit of code written for SQL Server / Sybase that I think answers your basic questions:
create table currency(
id numeric(16,0) identity primary key,
code varchar(3) not null,
lastupdated datetime not null,
rate smallint
);
create table client(
id numeric(16,0) identity primary key,
createddate datetime not null,
currencyid numeric(16,0) foreign key references currency(id)
);
insert into currency (code, lastupdated, rate)
values('EUR',GETDATE(),3)
--inserts the date and last allocated identity into client
insert into client(createddate, currencyid)
values(GETDATE(), @@IDENTITY)
go