SQL JOIN on Dynamic Column based on Variable - sql

I have an image summary table [summary] that will serve as a reporting table in the near future. There is a reference table [views] and a third table that the image team populates [TeamImage]. The summary table has 1 row per part number (table has distinct part numbers) and many columns of image views (TOP, BOT, FRO, BAC, etc.). The [views] table lists each of these views with an id field, which is an IDENTITY field. The [TeamImage] table contains part numbers and views (part number field is not unique as the part numbers will be listed multiple times as they have image views).
Example:
TABLE [summary]
Part_Number | TOP | BOT | FRO | BAC |
12345 | | | | |
67890 | | | | |
TABLE [views]
id | View |
1 | TOP |
2 | BOT |
3 | FRO |
4 | BAC |
TABLE [TeamImage]
PartNum | View |
12345 | TOP |
12345 | BOT |
12345 | FRO |
12345 | BAC |
67890 | FRO |
67890 | BAC |
Here's what I need in the end:
TABLE [summary]
Part_Number | TOP | BOT | FRO | BAC |
12345 | 1 | 1 | 1 | 1 |
67890 | | | 1 | 1 |
I could run several update queries but I have 27 views and about 2 million part numbers. I was hoping I could run something like below, even though I know I cannot use a variable as the column name:
DECLARE @id int = (SELECT max(id) FROM [views]), @ViewType nvarchar(3);
WHILE @id IS NOT NULL
BEGIN
SELECT @ViewType = (SELECT [View] FROM [views] WHERE id = @id);
UPDATE a
SET a.[@ViewType] = '1'
FROM [summary] a
INNER JOIN [TeamImage] b
ON a.[Part_Number] = b.[PartNum]
WHERE b.[View] = @ViewType;
SELECT @id = max(id) FROM [views] WHERE id < @id;
END;
Basically, I was hoping to use a variable to grab the different views from the [views] table (id = 27 down to id = 1; I could have counted up, but it doesn't matter) and populate the corresponding field in the [summary] table.
I know the SET a.[@ViewType] = '1' won't work. A colleague of mine mentioned using VB, but didn't know whether that really was the most efficient option. I understand that I could use a PIVOT on the [TeamImage] table, but I'm not sure that will allow me to update my [summary] table (which has many more fields in it than just the image views). It still seems I need something that will effectively loop through update queries. I could write 4 update queries, one for each view (although my real table has 27 views), but I need something more dynamic in case we add views in the future.

You can build the final summary with conditional aggregation, which is effectively a manual pivot. The query below is fixed to the four view codes in your example. SQL Server also has a PIVOT operator, but I am not familiar enough with it to show it here.
select
TA.PartNum,
max( case when TA.[View] = 'TOP' then '1' else ' ' end ) as TOPview,
max( case when TA.[View] = 'BOT' then '1' else ' ' end ) as BOTview,
max( case when TA.[View] = 'FRO' then '1' else ' ' end ) as FROview,
max( case when TA.[View] = 'BAC' then '1' else ' ' end ) as BACview
from
TeamImage TA
group by
TA.PartNum
This is simple to expand to more views, but you can also look into the PIVOT syntax.

I asked the question a little better here: SQL output as variable in VB.net and was able to receive an answer that worked for what I was looking for. I appreciate DRapp providing a solution through PIVOT, but I think the VB way will be easier for me moving forward. In short, using VB with ExecuteScalar and ExecuteNonQuery, I was able to re-write my query using the variables I had above.
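The loop described above (read each view name, then run one UPDATE per view) can be sketched in a few lines. This is only a sketch and not the VB code from the linked question: it uses Python with an in-memory SQLite database standing in for SQL Server, and the helper name fill_summary is made up for illustration. Column names cannot be bound as query parameters, so the view name is spliced into the statement text after being checked against the columns that actually exist in summary.

```python
import sqlite3

def fill_summary(conn):
    """Set summary.<view> = '1' for every (part, view) pair found in TeamImage."""
    # Column names that really exist in summary; used to validate view names.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(summary)")}
    views = [row[0] for row in conn.execute("SELECT [View] FROM views ORDER BY id")]
    for view in views:
        if view not in columns:
            continue  # a view without a matching summary column is skipped
        # The validated column name goes into the SQL text; the value is bound with ?.
        conn.execute(
            f"UPDATE summary SET [{view}] = '1' "
            "WHERE Part_Number IN (SELECT PartNum FROM TeamImage WHERE [View] = ?)",
            (view,),
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE summary (Part_Number TEXT PRIMARY KEY, TOP TEXT, BOT TEXT, FRO TEXT, BAC TEXT);
    CREATE TABLE views (id INTEGER PRIMARY KEY, [View] TEXT);
    CREATE TABLE TeamImage (PartNum TEXT, [View] TEXT);
    INSERT INTO summary (Part_Number) VALUES ('12345'), ('67890');
    INSERT INTO views ([View]) VALUES ('TOP'), ('BOT'), ('FRO'), ('BAC');
    INSERT INTO TeamImage VALUES
        ('12345', 'TOP'), ('12345', 'BOT'), ('12345', 'FRO'), ('12345', 'BAC'),
        ('67890', 'FRO'), ('67890', 'BAC');
""")
fill_summary(conn)
rows = conn.execute("SELECT * FROM summary ORDER BY Part_Number").fetchall()
```

Because the view names are read from the [views] table at run time, adding a 28th view only requires a new row there plus a new column in summary; the loop itself does not change.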

Related

Is there a way to ensure WHERE clause happens after DISTINCT?

Imagine you have a table comments in your database.
The comments table has the columns id, comment_id_no, text, show and inserted_at.
If a user enters a comment, it inserts a row into the database
| id | comment_id_no | text | show | inserted_at |
| -- | -------------- | ---- | ---- | ----------- |
| 1 | 1 | hi | true | 1/1/2000 |
If a user wants to update that comment it inserts a new row into the db
| id | comment_id_no | text | show | inserted_at |
| -- | -------------- | ---- | ---- | ----------- |
| 1 | 1 | hi | true | 1/1/2000 |
| 2 | 1 | hey | true | 1/1/2001 |
Notice it keeps the same comment_id_no. This is so we will be able to see the history of a comment.
Now the user decides that they no longer want to display their comment
| id | comment_id_no | text | show | inserted_at |
| -- | -------------- | ---- | ----- | ----------- |
| 1 | 1 | hi | true | 1/1/2000 |
| 2 | 1 | hey | true | 1/1/2001 |
| 3 | 1 | hey | false | 1/1/2002 |
This hides the comment from the end users.
Now a second comment is made (not an update of the first)
| id | comment_id_no | text | show | inserted_at |
| -- | -------------- | ---- | ----- | ----------- |
| 1 | 1 | hi | true | 1/1/2000 |
| 2 | 1 | hey | true | 1/1/2001 |
| 3 | 1 | hey | false | 1/1/2002 |
| 4 | 2 | new | true | 1/1/2003 |
What I would like to be able to do is select the latest version of each unique comment_id_no, where show is equal to true. However, I do not want the query to return id=2.
Steps the query needs to take...
select all the most recent, distinct comment_id_nos. (should return id=3 and id=4)
select where show = true (should only return id=4)
Note: I am actually writing this query in elixir using ecto and would like to be able to do this without using the subquery function. If anyone can answer this in sql I can convert the answer myself. If anyone knows how to answer this in elixir then also feel free to answer.
You can do this without using a subquery using LEFT JOIN:
SELECT c.id, c.comment_id_no, c.text, c.show, c.inserted_at
FROM Comments AS c
LEFT JOIN Comments AS c2
ON c2.comment_id_no = c.comment_id_no
AND c2.inserted_at > c.inserted_at
WHERE c2.id IS NULL
AND c.show = 'true';
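To check the LEFT JOIN query against the sample rows from the question, here is a small sketch using Python's sqlite3 module. SQLite is an assumption made only so the example runs anywhere; it has no boolean type, so show is stored as 1/0 and the dates are written as ISO strings. The second query is not from any answer; it is included only to show what goes wrong when the show filter runs before the "latest row" step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        comment_id_no INTEGER,
        text TEXT,
        show INTEGER,          -- 1 = true, 0 = false
        inserted_at TEXT
    );
    INSERT INTO comments VALUES
        (1, 1, 'hi',  1, '2000-01-01'),
        (2, 1, 'hey', 1, '2001-01-01'),
        (3, 1, 'hey', 0, '2002-01-01'),
        (4, 2, 'new', 1, '2003-01-01');
""")

latest_then_filter = """
    SELECT c.id
    FROM comments AS c
    LEFT JOIN comments AS c2
           ON c2.comment_id_no = c.comment_id_no
          AND c2.inserted_at > c.inserted_at
    WHERE c2.id IS NULL      -- no newer version exists, so c is the latest row
      AND c.show = 1         -- visibility is checked on the latest row only
    ORDER BY c.id
"""
visible_ids = [row[0] for row in conn.execute(latest_then_filter)]

# For contrast: filtering first and then taking the newest remaining row
# brings the hidden comment back through its older version (id = 2).
filter_then_latest = """
    SELECT MAX(id) FROM comments WHERE show = 1 GROUP BY comment_id_no ORDER BY 1
"""
wrong_ids = [row[0] for row in conn.execute(filter_then_latest)]
```

Here visible_ids contains only 4, while wrong_ids contains 2 and 4, which is exactly the result the question wants to avoid.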
I think all other approaches will require a subquery of some sort, this would usually be done with a ranking function:
SELECT c.id, c.comment_id_no, c.text, c.show, c.inserted_at
FROM ( SELECT c.id,
c.comment_id_no,
c.text,
c.show,
c.inserted_at,
ROW_NUMBER() OVER(PARTITION BY c.comment_id_no
ORDER BY c.inserted_at DESC) AS RowNumber
FROM Comments AS c
) AS c
WHERE c.RowNumber = 1
AND c.show = 'true';
Since you have tagged with Postgresql you could also make use of DISTINCT ON ():
SELECT *
FROM ( SELECT DISTINCT ON (c.comment_id_no)
c.id, c.comment_id_no, c.text, c.show, c.inserted_at
FROM Comments AS c
ORDER By c.comment_id_no, inserted_at DESC
) x
WHERE show = 'true';
Examples on DB<>Fiddle
I think you want:
select c.*
from comments c
where c.inserted_at = (select max(c2.inserted_at)
from comments c2
where c2.comment_id_no = c.comment_id_no
) and
c.show = 'true';
I don't understand what this has to do with select distinct. You simply want the last version of a comment, and then to check if you can show that.
EDIT:
In Postgres, I would do:
select c.*
from (select distinct on (comment_id_no) c.*
from comments c
order by c.comment_id_no, c.inserted_at desc
) c
where c.show
distinct on usually has pretty good performance characteristics.
As I said in the comments, I don't advise polluting data tables with history/audit data.
And no: the "double versioning" suggested by @Josh_Eller in his comment isn't a good solution either. It complicates queries unnecessarily and is also much more expensive in terms of processing and tablespace fragmentation.
Keep in mind that in PostgreSQL an UPDATE never modifies a row in place. It writes a whole new version of the row and marks the old one as deleted. That is why vacuum processes are needed to reclaim that space.
In any case, besides being suboptimal, that approach forces you to write more complex queries to read and write data, while I suppose that most of the time you will only need to select, insert, update or delete a single row, and only occasionally look up its history.
So the best solution (IMHO) is to implement the schema you actually need for your main task and keep the audit trail in a separate table maintained by a trigger.
This would be much more:
Robust and simple: you focus on a single thing at a time (Single Responsibility and KISS principles).
Fast: audit operations can be performed in an AFTER trigger, that is, once the INSERT, UPDATE or DELETE on the row itself has already been carried out.
Efficient: an update will, of course, still write a new row version and mark the old one as deleted, but this is done at a low level by the database engine. More than that, your audit data will be essentially unfragmented, because you only ever insert into it and never update it. So the overall fragmentation will be much lower.
That being said, how to implement it?
Suppose this simple schema:
create table comments (
text text,
mtime timestamp not null default now(),
id serial primary key
);
create table comments_audit ( -- Or audit.comments if using separate schema
text text,
mtime timestamp not null,
id integer,
rev integer not null,
primary key (id, rev)
);
...and then this function and trigger:
create or replace function fn_comments_audit()
returns trigger
language plpgsql
security definer
-- This allows you to restrict permissions on the audit table,
-- because the function is executed as the user who defined it
-- rather than as the user whose statement fired the trigger.
as $$
DECLARE
BEGIN
if TG_OP = 'DELETE' then
raise exception 'FATAL: Deletion is not allowed for %', TG_TABLE_NAME;
-- If you want to allow deletion there are a few more decisions to take...
-- So here I block it for the sake of simplicity ;-)
end if;
insert into comments_audit (
text
, mtime
, id
, rev
) values (
NEW.text
, NEW.mtime
, NEW.id
, coalesce (
(select max(rev) + 1 from comments_audit where id = new.ID)
, 0
)
);
return NULL;
END;
$$;
create trigger tg_comments_audit
after insert or update or delete
on public.comments
for each row
execute procedure fn_comments_audit()
;
And that's all.
Notice that with this approach your current comments data is always present in comments_audit as well. You could instead have used the OLD record and defined the trigger only for UPDATE (and DELETE) operations to avoid that.
But I prefer this approach, not only because it gives us extra redundancy (if rows were accidentally deleted from the master table, because deletion was allowed or the trigger was accidentally disabled, we could recover all the data from the audit table) but also because it simplifies (and optimises) querying the history when it's needed.
Now you only need to insert, update or select (or even delete, if you develop this schema a little further, e.g. by inserting a row with nulls) in a fully transparent manner, just as if there were no audit system at all. And when you need the history, you query the audit table instead.
NOTE: You might also want to include a creation timestamp (ctime). In that case it would be worth preventing it from being modified, using a BEFORE trigger. I omitted it for the sake of simplicity, because you can already derive it from the mtimes in the audit table (although if you are going to use it in your application, adding it would be very advisable).
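To see the revision numbering in action without a PostgreSQL server, the same idea can be sketched with SQLite triggers driven from Python. This is an analogue and not the plpgsql function above: SQLite has no trigger functions, so the audit insert is repeated in one trigger per event, and deletion is blocked with RAISE(ABORT, ...). The trigger names are made up for illustration.

```python
import sqlite3

# Shared body: store the NEW row as the next revision (0 for the first one).
AUDIT_INSERT = """
    INSERT INTO comments_audit (id, rev, text, mtime)
    VALUES (NEW.id,
            COALESCE((SELECT MAX(rev) + 1 FROM comments_audit WHERE id = NEW.id), 0),
            NEW.text, NEW.mtime);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
    CREATE TABLE comments (
        id    INTEGER PRIMARY KEY,
        text  TEXT,
        mtime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE comments_audit (
        id    INTEGER NOT NULL,
        rev   INTEGER NOT NULL,
        text  TEXT,
        mtime TEXT NOT NULL,
        PRIMARY KEY (id, rev)
    );
    CREATE TRIGGER tg_comments_audit_ins AFTER INSERT ON comments
    BEGIN {AUDIT_INSERT} END;
    CREATE TRIGGER tg_comments_audit_upd AFTER UPDATE ON comments
    BEGIN {AUDIT_INSERT} END;
    CREATE TRIGGER tg_comments_no_delete BEFORE DELETE ON comments
    BEGIN SELECT RAISE(ABORT, 'Deletion is not allowed for comments'); END;
""")

# The application only touches the main table; the audit rows appear by themselves.
conn.execute("INSERT INTO comments (text) VALUES ('hi')")
conn.execute("UPDATE comments SET text = 'hey' WHERE id = 1")
history = conn.execute(
    "SELECT rev, text FROM comments_audit WHERE id = 1 ORDER BY rev"
).fetchall()

try:
    conn.execute("DELETE FROM comments WHERE id = 1")
    delete_blocked = False
except sqlite3.DatabaseError:
    delete_blocked = True
```

After the insert and the update, history holds revision 0 ('hi') and revision 1 ('hey'), while the comments table still contains a single current row.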
If you are running Postgres 8.4 or higher, ROW_NUMBER() is a good solution. Note that the show filter has to go in the outer query, so that it is applied after the latest row has been picked:
SELECT *
FROM (
SELECT c.*, ROW_NUMBER() OVER(PARTITION BY comment_id_no ORDER BY inserted_at DESC) rn
FROM comments c
) x WHERE rn = 1 AND x.show = 'true'
Else, this could also be achieved using a WHERE NOT EXISTS condition, that ensures that you are showing the latest comment :
SELECT c.*
FROM comments c
WHERE
c.show = 'true'
AND NOT EXISTS (
SELECT 1
FROM comments c1
WHERE c1.comment_id_no = c.comment_id_no AND c1.inserted_at > c.inserted_at
)
You can use group by to get the latest id for each comment_id_no and then join to the comments table to filter out the rows where show = false:
select c.*
from comments c inner join (
select comment_id_no, max(id) maxid
from comments
group by comment_id_no
) g on g.maxid = c.id
where c.show = 'true'
I assume that the column id is unique and autoincrement in comments table.
See the demo

Converting a number value in a field to a word

I have a SQL query that I'm building and need to convert a number to a word.
The field is titled Type and the values are 1 or 2. I need to convert the 1 to display as Problem and the 2 to display as Resolution.
How would I go about doing this? I built this as an expression in SQL data tools, but we are going in a different direction and need to add it to the query instead and display the report another way.
Thanks!
I'd recommend you have a second table with your ID/Name combination as a lookup and do a JOIN.
That way, as new types come in, you only have to change the name, and not the code.
Although, the syntax would be
CASE WHEN Type = 1 THEN 'Problem'
WHEN Type = 2 THEN 'Resolution' END
You've essentially built the first portion of a normalized database; you just haven't completed the second portion. The numbers 1 and 2 can be foreign keys that reference the unique auto-incremented ID field of a second table:
+-------+ +----+-------------+
| Type | | ID | Description |
+-------+ +----+-------------+
| 1 | | 1 | Problem |
| 2 | | 2 | Resolution |
| 2 | +----+-------------+
| 1 |
+-------+
With this schema, you can then query the data like so:
SELECT `table2`.`description`
FROM `table2`
INNER JOIN `table1`
ON `table1`.`type` = `table2`.`id`
WHERE `table1`.`type` = 1
Fiddle: Live Demo
What this does is it allows you to add more IDs and Descriptions to your second data table without having to rewrite a bunch of code.
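The two approaches discussed so far (a lookup table with a JOIN, and an inline CASE expression) can be compared side by side. The sketch below assumes Python with an in-memory SQLite database; the table names type_names and tickets are hypothetical stand-ins for the real tables, which the question does not name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE type_names (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
    CREATE TABLE tickets (
        ticket_id INTEGER PRIMARY KEY,
        type INTEGER REFERENCES type_names(id)
    );
    INSERT INTO type_names VALUES (1, 'Problem'), (2, 'Resolution');
    INSERT INTO tickets (type) VALUES (1), (2), (2), (1);
""")

# Lookup-table version: the words live in data, not in the query.
via_join = conn.execute("""
    SELECT n.description
    FROM tickets AS t
    JOIN type_names AS n ON n.id = t.type
    ORDER BY t.ticket_id
""").fetchall()

# CASE version: the words are hard-coded in the query text.
via_case = conn.execute("""
    SELECT CASE t.type WHEN 1 THEN 'Problem'
                       WHEN 2 THEN 'Resolution' END
    FROM tickets AS t
    ORDER BY t.ticket_id
""").fetchall()
```

Both queries return the same four rows. The difference shows up when a third type is introduced: the JOIN version needs one new row in type_names, while the CASE version needs every query that contains it to be edited.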
Try below option, it will work with 2 types only (as in question):
Declare @Type int = 1
select
case @Type when 1 then 'Problem'
else 'Resolution'
end as Result
set @Type = 2
select
case @Type when 1 then 'Problem'
else 'Resolution'
end as Result

JOIN two tables, but only include data from first table in first instance of each unique record

Title might be confusing.
I have a table of Cases, and each Case can contain many Tasks. To achieve a different workflow for each Task, I have different tables such as Case_Emails, Case_Calls, Case_Chats, etc...
I want to build a Query that will eventually be exported to Excel. In this query, I want to list out each Task, and the Tasks are already joined together via a UNION in another table using a common format. For each task in the Query, I want only the first Task associated with a case to include the details from Cases table. Example below:
+----+---------+------------+-------------+-------------+-------------+
| id | Case ID | Agent Name | Task Info 1 | Task Info 2 | Task Info 3 |
+----+---------+------------+-------------+-------------+-------------+
| 1 | 4000000 | Some Name | Detailstuff | Stuffdetail | Thingsyo |
| 2 | | | Detailstuff | Stuffdetail | Thingsyo |
| 3 | | | Detailstuff | Stuffdetail | Thingsyo |
| 4 | 4000003 | Some Name | Detailstuff | Stuffdetail | Thingsyo |
| 5 | | | Detailstuff | Stuffdetail | Thingsyo |
| 6 | 4000006 | Some Name | Detailstuff | Stuffdetail | Thingsyo |
+----+---------+------------+-------------+-------------+-------------+
My original approach was attempting a LEFT JOIN on Case ID, but I couldn't figure out how to filter the data out from the extra rows.
This would be much simpler if Access supported the ROW_NUMBER function. It doesn't, but you can sort of simulate it with a correlated subquery using the Tasks table (this assumes that each task has a unique numeric ID). This basically assigns a row number to each task, partitioned by the CaseID. Then you can just conditionally display the CaseID and AgentName where RowNum = 1.
SELECT Switch(RowNum = 1, CaseID) as Case,
Switch(RowNum = 1, AgentName) as Agent,
TaskName
FROM (
SELECT c.CaseID,
c.AgentName,
t.TaskName,
(select count(*)
from Tasks t2
where t2.CaseID = c.CaseID and t2.ID <= t.ID) as RowNum
FROM Cases c
INNER JOIN Tasks t ON c.CaseID = t.CaseID
order by c.CaseID, t.TaskName
)
You didn't post your table structure, so I'm not sure this will work for you as-is, but maybe you can adapt it.
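The row-number simulation can be tried on a few rows to see the blanking effect. This is a sketch under assumptions: it runs on SQLite through Python rather than on Access, so Switch is replaced by a CASE expression, and the sample rows and agent names are invented. The Cases and Tasks table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cases (CaseID INTEGER PRIMARY KEY, AgentName TEXT);
    CREATE TABLE Tasks (ID INTEGER PRIMARY KEY, CaseID INTEGER, TaskName TEXT);
    INSERT INTO Cases VALUES (4000000, 'Ann'), (4000003, 'Bob');
    INSERT INTO Tasks VALUES
        (1, 4000000, 'email'), (2, 4000000, 'call'), (3, 4000003, 'chat');
""")

rows = conn.execute("""
    SELECT CASE WHEN r.RowNum = 1 THEN r.CaseID END    AS CaseShown,
           CASE WHEN r.RowNum = 1 THEN r.AgentName END AS AgentShown,
           r.TaskName
    FROM (
        SELECT c.CaseID, c.AgentName, t.ID AS TaskID, t.TaskName,
               -- Position of this task within its case, counted by task ID.
               (SELECT COUNT(*)
                FROM Tasks AS t2
                WHERE t2.CaseID = t.CaseID AND t2.ID <= t.ID) AS RowNum
        FROM Cases AS c
        JOIN Tasks AS t ON t.CaseID = c.CaseID
    ) AS r
    ORDER BY r.CaseID, r.TaskID
""").fetchall()
```

Only the first task of each case carries the case ID and agent name; the remaining tasks of that case come back with NULL in those two columns, which exports to Excel as empty cells.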
No matter what, when you join you will get duplicate values. To remove the duplicates, either put a DISTINCT in your select or add a GROUP BY after your filters. This should resolve the duplicates in your query for task info 1, 2 and 3.
Found out that I can name my tables in the query like so:
FROM Case_Calls Calls
With this other name, I was able to filter based on a sub query:
IIF( Calls.[ID] <> (select top 1 [ID] from Case_Calls where [Case ID] = Calls.[Case ID]), '', Cases.[Creator]) As [Case Creator]
This solution gives me the results that I want :) It's rather ugly SQL, and difficult to parse when I'm dealing with dozens of columns, but it gets the job done!
I'm still curious if there is a better solution...

Language fallback with database content in OpenCart

I've just (6 months+) started learning all the web languages, mostly within OpenCart's framework. Right now I'm trying to get language fallback to work with database content.
The objective is to check for an empty field in id.title and if so choose the default language_id=1.
The language_id comes from a GET request invoked by the frontend user.
The table description looks like this:
------------------------------------------------------
| information_id | language_id | title | description |
------------------------------------------------------
| 3 | 1 | policy | policy desc |
------------------------------------------------------
| 4 | 1 | about | about desc |
------------------------------------------------------
| 4 | 2 | | |
------------------------------------------------------
| 5 | 1 | terms | terms desc |
------------------------------------------------------
| 6 | 1 | comp | comp desc |
------------------------------------------------------
As you can see language_id=2 has no title nor description (inserted by sql, not oc's admin). In this case I want to get the row with the default language=1.
I've tried using CASE but the results are always empty. The part I can't find a solution for is checking the title field of the row for the requested language_id.
I've also tried to first check the field before doing a SELECT, but no success.
SELECT DISTINCT *
FROM information_description id
WHERE id.information_id = '4'
AND id.language_id = (CASE WHEN id.title = '' THEN '1' ELSE '2' END);
Any help would be appreciated.
Here you have two options. Either do this with subselects, or, once per save/update, walk through all the information descriptions and fill in the missing languages with the texts from the default one.
The first solution could be:
SELECT id.information_id, id.language_id, id.description,
CASE WHEN id.title IS NOT NULL /* or id.title <> '', depending on the real value in the DB when it's empty */
THEN id.title
ELSE (SELECT title FROM information_description WHERE language_id = 1 AND information_id = id.information_id)
END AS title
FROM information_description id
WHERE id.language_id = 2
I would run such a query only when the language ID differs from the default one. Although this may look like a working solution, I don't like it, simply because it increases the load on the DB.
Instead, I recommend simply filling in your missing data in a similar way (this may only need to be done once). MySQL does not allow a subquery in an UPDATE to select from the table being updated, so join the table to itself:
UPDATE information_description id
JOIN information_description def
ON def.information_id = id.information_id AND def.language_id = 1
SET id.title = def.title
WHERE id.language_id <> 1
AND id.title IS NULL
for title and
UPDATE information_description id
JOIN information_description def
ON def.information_id = id.information_id AND def.language_id = 1
SET id.description = def.description
WHERE id.language_id <> 1
AND id.description IS NULL
for the description field. If the empty values are stored as '' rather than NULL, test for that instead.
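A read-time fallback can also be written with a self-join instead of a subselect. The sketch below is a variant of the first option, not the query from the answer: it starts from the default-language row and uses COALESCE(NULLIF(..., ''), ...) so that both a missing row and an empty string fall back to the default. It assumes Python with an in-memory SQLite database, and the helper name get_texts is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE information_description (
        information_id INTEGER,
        language_id    INTEGER,
        title          TEXT,
        description    TEXT,
        PRIMARY KEY (information_id, language_id)
    );
    INSERT INTO information_description VALUES
        (3, 1, 'policy', 'policy desc'),
        (4, 1, 'about',  'about desc'),
        (4, 2, '',       ''),
        (5, 1, 'terms',  'terms desc');
""")

def get_texts(conn, information_id, language_id, default_language_id=1):
    """Return (title, description), falling back to the default language."""
    return conn.execute("""
        SELECT COALESCE(NULLIF(req.title, ''), def.title),
               COALESCE(NULLIF(req.description, ''), def.description)
        FROM information_description AS def
        LEFT JOIN information_description AS req
               ON req.information_id = def.information_id
              AND req.language_id = ?
        WHERE def.information_id = ?
          AND def.language_id = ?
    """, (language_id, information_id, default_language_id)).fetchone()

empty_translation = get_texts(conn, 4, 2)    # row exists but is blank
missing_translation = get_texts(conn, 3, 2)  # no row for language 2 at all
```

Both calls return the language 1 texts. The language ID is passed as a bound parameter, which matters here because it originates from a GET request.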

Fetch Id's that are related to a specific set of items, but not others

Good morning all, apologies for the title... I had trouble simplifying the problem down to a line. My database platform is Teradata.
I am working w/ a table like the following (let's call it "t1")
+------------+----------------------------------------+
| Service_Id | Product |
+------------+----------------------------------------+
| 1 | Traffic |
| 1 | Weather |
| 1 | Travel |
| 1 | Audio |
| 1 | Audio Add-on |
| 2 | Traffic |
| 2 | Weather |
| 2 | Travel |
+------------+----------------------------------------+
I am trying to select service_id's that are related to the following products AND ONLY the following products: Traffic, Weather, Travel
"Service_Id = 1" does not apply here because while it has the required products, it also has an "audio" product related to it... so we have to leave it out. I was able to successfully do this through a series of temp (volatile) tables but it's feeling really hacky and I feel there's got to be a better way. Thanks for your assistance.
I'm doing stuff like that (find a subset/superset/exact match for a set of rows) in my training classes using pizzas :-)
There are several ways to get your result, but for an exact match the easiest way is a SUM using following logic:
SELECT service_id
FROM t1
GROUP BY 1
HAVING
SUM(CASE WHEN Product IN ('Traffic', 'Weather', 'Travel') THEN 1 ELSE -1 END) = 3
This assumes that each Product appears at most once per Service_Id.
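The scoring logic (plus one for each wanted product, minus one for anything else, and the total must equal the number of wanted products) can be checked on the sample data. The sketch assumes Python with an in-memory SQLite database rather than Teradata, and adds a hypothetical third service that has only two of the three products, to show that a subset is rejected as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (Service_Id INTEGER, Product TEXT);
    INSERT INTO t1 VALUES
        (1, 'Traffic'), (1, 'Weather'), (1, 'Travel'), (1, 'Audio'), (1, 'Audio Add-on'),
        (2, 'Traffic'), (2, 'Weather'), (2, 'Travel'),
        (3, 'Traffic'), (3, 'Weather');
""")

wanted = ("Traffic", "Weather", "Travel")
placeholders = ", ".join("?" for _ in wanted)

# Service 1 scores 3 - 2 = 1, service 2 scores 3, service 3 scores 2.
query = f"""
    SELECT Service_Id
    FROM t1
    GROUP BY Service_Id
    HAVING SUM(CASE WHEN Product IN ({placeholders}) THEN 1 ELSE -1 END) = ?
    ORDER BY Service_Id
"""
exact_matches = [row[0] for row in conn.execute(query, (*wanted, len(wanted)))]
```

Only service 2 passes. Any extra product pulls the sum below 3, and any missing product means the sum can never reach 3.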
SELECT service_ID
FROM tableName a
WHERE Product IN ('Traffic', 'Weather', 'Travel') AND
EXISTS
(
SELECT 1
FROM tableName b
WHERE a.Service_ID = b.Service_ID
GROUP BY b.Service_ID
HAVING COUNT(*) = 3 -- <<== total number of products
)
GROUP BY service_ID
HAVING COUNT(*) = 3 -- <<== total number of products
SQLFiddle Demo (demo is running under MySQL database, not sure if it will work on teradata)