Deciphering SQL query - sql

I am currently reviewing a query without access to the databases it runs against. (It's not ideal, but that's what I am tasked with.) I am not a SQL expert, and since I cannot run the query I am trying to work out what the code below does. It is reading from and writing to the same temp table (duplicating?). I don't know what the source of 'Y' is or what the end result is. Any help is appreciated. Thank you.
INSERT INTO #temp1
SELECT X.CURSTATUS ,X.GENDER ,Y.PACKAGE ,X.AGE ,1 AS factor1 ,1 AS factor2
FROM #temp1 X WITH (NOLOCK) ,
( SELECT 'P1' AS PACKAGE UNION ALL SELECT 'P2' ) Y
WHERE X.PACKAGE = 'P5';

It is not overwriting the table; it is appending rows to it. That is, existing data in the table is not affected.
What it is doing is adding rows for packages "P1" and "P2" for every "P5" row. The new rows are added to the table; the "P5" row remains.

For every row in #temp1 that has a PACKAGE value of "P5", the query inserts two new rows with PACKAGE values of "P1" and "P2" respectively.
Reformatting the query and replacing obsolete syntax with modern syntax should make it easier to understand.
INSERT INTO #temp1 (CURSTATUS, GENDER, PACKAGE, AGE, factor1, factor2)
SELECT
X.CURSTATUS,
X.GENDER,
Y.PACKAGE,
X.AGE,
1 AS factor1,
1 AS factor2
FROM
#temp1 X
CROSS JOIN (
SELECT 'P1' AS PACKAGE
UNION ALL
SELECT 'P2'
) Y
WHERE
X.PACKAGE = 'P5';

INSERT INTO #temp1
-- This is the table the data is being inserted into. It should, by the way, have its columns explicitly listed; omitting them is a SQL antipattern.
SELECT X.CURSTATUS ,X.GENDER ,Y.PACKAGE ,X.AGE ,1 AS factor1 ,1 AS factor2
FROM #temp1 X WITH (NOLOCK) ,
-- This selects the current rows from #temp1.
( SELECT 'P1' AS PACKAGE UNION ALL SELECT 'P2' ) Y
-- Y is a two-row table with one column called PACKAGE. Since no join condition is given, it is a cross join. Again an antipattern: it is far better to use the CROSS JOIN keywords explicitly to make clear what is going on.
WHERE X.PACKAGE = 'P5';
-- This filters the rows from #temp1 to only those where PACKAGE is 'P5'. Since they are cross joined to the two-row table Y, the query takes the data in the other columns of the P5 rows (with factor1 and factor2 set to 1) and inserts new rows for P1 and P2. If you have ten P5 rows, this insert adds 10 P1 rows and 10 P2 rows.
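For anyone who, like the asker, cannot run the original, the effect can be reproduced on throwaway data. This is only a minimal sketch in Python with SQLite standing in for SQL Server: the table is called temp1 rather than #temp1, the NOLOCK hint is dropped because SQLite has no equivalent, and the three sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp1 (CURSTATUS TEXT, GENDER TEXT, PACKAGE TEXT, "
    "AGE INTEGER, factor1 INTEGER, factor2 INTEGER)"
)
# Hypothetical sample data: two P5 rows and one unrelated P3 row.
conn.executemany(
    "INSERT INTO temp1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("A", "F", "P5", 30, 2, 3),
        ("B", "M", "P5", 41, 4, 5),
        ("A", "M", "P3", 25, 6, 7),
    ],
)

# Same shape as the query under review: every P5 row is paired with
# both rows of Y, and the resulting pairs are appended to the table.
conn.execute(
    """
    INSERT INTO temp1 (CURSTATUS, GENDER, PACKAGE, AGE, factor1, factor2)
    SELECT X.CURSTATUS, X.GENDER, Y.PACKAGE, X.AGE, 1, 1
    FROM temp1 AS X
    CROSS JOIN (SELECT 'P1' AS PACKAGE UNION ALL SELECT 'P2') AS Y
    WHERE X.PACKAGE = 'P5'
    """
)

# Row count per package after the insert.
counts = dict(
    conn.execute("SELECT PACKAGE, COUNT(*) FROM temp1 GROUP BY PACKAGE")
)
print(counts)
```

With two P5 rows going in, the table gains two P1 rows and two P2 rows, while the original P5 and P3 rows are left as they were.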

Related

Generating Lines based on a value from a column in another table

I have the following table:
EventID=00002,DocumentID=0005,EventDesc=ItemsReceived
I have the quantity in another table
DocumentID=0005,Qty=20
I want to generate a result of 20 lines (depending on the quantity) with an auto generated column which will have a sequence of:
ITEM_TAG_001,
ITEM_TAG_002,
ITEM_TAG_003,
ITEM_TAG_004,
..
ITEM_TAG_020
Here's your sql query.
with cte as (
select 1 as ctr, t2.Qty, t1.EventID, t1.DocumentId, t1.EventDesc from tableA t1
inner join tableB t2 on t2.DocumentId = t1.DocumentId
union all
select ctr + 1, Qty, EventID, DocumentId, EventDesc from cte
where ctr < Qty
)select *, concat('ITEM_TAG_', right('000'+ cast(ctr AS varchar(3)),3)) from cte
option (maxrecursion 0);
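To check the row count of a recursive CTE like this without a SQL Server instance, here is a rough sketch in Python using SQLite. It is an approximation, not the same dialect: SQLite needs the RECURSIVE keyword, has no MAXRECURSION option, and printf() replaces the RIGHT('000' + ...) padding. The recursive member here stops at ctr < Qty, which gives exactly Qty rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tableA (EventID TEXT, DocumentID TEXT, EventDesc TEXT);
    CREATE TABLE tableB (DocumentID TEXT, Qty INTEGER);
    INSERT INTO tableA VALUES ('00002', '0005', 'ItemsReceived');
    INSERT INTO tableB VALUES ('0005', 20);
    """
)

tags = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE cte(ctr, Qty, DocumentID) AS (
            -- Anchor member: one row per document, counter starts at 1.
            SELECT 1, t2.Qty, t1.DocumentID
            FROM tableA t1
            JOIN tableB t2 ON t2.DocumentID = t1.DocumentID
            UNION ALL
            -- Recursive member: keep counting until ctr reaches Qty.
            SELECT ctr + 1, Qty, DocumentID FROM cte
            WHERE ctr < Qty
        )
        SELECT printf('ITEM_TAG_%03d', ctr) FROM cte ORDER BY ctr
        """
    )
]
print(len(tags), tags[0], tags[-1])
```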
The best approach is to introduce a numbers table, which is very handy in many places.
Something along these lines:
Create some test data:
DECLARE @MockNumbers TABLE(Number BIGINT);
DECLARE @YourTable1 TABLE(DocumentID INT,ItemTag VARCHAR(100),SomeText VARCHAR(100));
DECLARE @YourTable2 TABLE(DocumentID INT, Qty INT);
INSERT INTO @MockNumbers SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values;
INSERT INTO @YourTable1 VALUES(1,'FirstItem','qty 5'),(2,'SecondItem','qty 7');
INSERT INTO @YourTable2 VALUES(1,5), (2,7);
--The query
SELECT CONCAT(t1.ItemTag,'_',REPLACE(STR(A.Number,3),' ','0'))
FROM @YourTable1 t1
INNER JOIN @YourTable2 t2 ON t1.DocumentID=t2.DocumentID
CROSS APPLY(SELECT Number FROM @MockNumbers WHERE Number BETWEEN 1 AND t2.Qty) A;
The result
FirstItem_001
FirstItem_002
[...]
FirstItem_005
SecondItem_001
SecondItem_002
[...]
SecondItem_007
The idea in short:
We use an INNER JOIN to get the quantity joined to the item.
Now we use APPLY, which is a row-wise operation, to bind as many rows to the set as we need.
The first item will return 5 lines, the second 7. The trick with STR() and REPLACE() is one way to create a padded number. You might use FORMAT() instead (SQL Server 2012+), but it is rather slow.
The numbers table used here is a declared table variable containing the numbers from 1 to 100. A separate answer shows how to create a physical numbers and date table. Any database should have such a table.
If you don't want to create a numbers table, you can search for "tally table" or "tally on the fly". There are many answers showing how to create a list of running numbers.
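The numbers-table idea is not tied to SQL Server. As a hedged sketch, here it is in Python with SQLite: there is no CROSS APPLY in SQLite, so a plain join on Number BETWEEN 1 AND Qty plays that role, and printf() does the zero padding instead of STR() and REPLACE(). The table name Numbers is an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Numbers (Number INTEGER PRIMARY KEY);
    CREATE TABLE YourTable1 (DocumentID INTEGER, ItemTag TEXT, SomeText TEXT);
    CREATE TABLE YourTable2 (DocumentID INTEGER, Qty INTEGER);
    INSERT INTO YourTable1 VALUES (1, 'FirstItem', 'qty 5'), (2, 'SecondItem', 'qty 7');
    INSERT INTO YourTable2 VALUES (1, 5), (2, 7);
    """
)
# Fill the numbers table with 1..100.
conn.executemany("INSERT INTO Numbers VALUES (?)", [(n,) for n in range(1, 101)])

rows = [
    r[0]
    for r in conn.execute(
        """
        SELECT printf('%s_%03d', t1.ItemTag, n.Number)
        FROM YourTable1 t1
        JOIN YourTable2 t2 ON t1.DocumentID = t2.DocumentID
        -- Each item picks up as many number rows as its quantity.
        JOIN Numbers n ON n.Number BETWEEN 1 AND t2.Qty
        ORDER BY t1.DocumentID, n.Number
        """
    )
]
print(rows)
```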

Database Trigger that finds nearest record from another table after insert

I have two tables(a,b), both with a shape or geometry field. I want a trigger to run after an insert on table a to find the (single) nearest spatial record from table b. I have looked into the STDistance function with little luck. Table a is unique.
AFTER INSERT
Table a
OBJECTID,RoadID
12345,NULL
Table b
AssetID
RD12345
RD12233
RD12333
RD12222
STDistance would say that Table a.OBJECTID 12345 is nearest to Table b.AssetID = RD12222
Result
Table a
OBJECTID,RoadID
12345,RD12222
I have completed some preliminary testing which returns all matching records (from both tables) but I am trying to condense it down to only the matching record with the lowest distance, hence the aggregate function(MIN) on STDistance.
SELECT TableA.AssetID,MIN(TableA.Shape.STDistance(TableB.Shape)) AS DIST, TableB.AssetID AS RoadID
FROM TableA, TableB
GROUP BY TableA.AssetID, TableB.AssetID
HAVING MIN(TableA.Shape.STDistance(TableB.Shape)) < 250
ORDER BY AssetID
The result I get is a many to many relationship by distance for all records. If I apply the aggregate function(MIN) I can reduce it significantly however the Table a unique id's still duplicate. The plan is once the select statement worked I would translate it into my trigger - I would prefer the answer to be based on how it would be implemented in a trigger.
You can use CROSS APPLY ... SELECT TOP 1 ... ORDER BY distance to pair each row of TableA with its nearest record in TableB:
SELECT A.OBJECTID, NearestB.B_ID, NearestB.Distance
FROM TableA A
CROSS APPLY(
select TOP 1
A.Shape.STDistance(B.Shape) AS distance,
B.AssetID as B_ID
from TableB B
order by 1
) NearestB
And the trigger might be:
CREATE TRIGGER TableA_Insertion_SetNearestB ON TableA
INSTEAD OF INSERT
AS
BEGIN
INSERT INTO TableA (
OBJECTID,
Shape,
RoadID
) SELECT
INSERTED.OBJECTID,
INSERTED.Shape,
NearestB.B_ID
FROM
INSERTED
CROSS APPLY(
select TOP 1
INSERTED.Shape.STDistance(B.Shape) AS distance,
B.AssetID as B_ID
from TableB B
order by 1
) NearestB
END
GO
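The "nearest row, ordered by distance, take one" pattern can be tried without a spatial database. The sketch below uses Python with SQLite and makes several assumptions that are not in the question: plain x/y columns replace the geometry field, squared planar distance replaces STDistance, the coordinates are made up, and the trigger is written as AFTER INSERT with an UPDATE rather than INSTEAD OF INSERT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableB (AssetID TEXT PRIMARY KEY, x REAL, y REAL);
    CREATE TABLE TableA (OBJECTID INTEGER PRIMARY KEY, x REAL, y REAL, RoadID TEXT);

    INSERT INTO TableB VALUES
        ('RD12345', 100, 100), ('RD12233', 50, 0),
        ('RD12333', 0, 60),    ('RD12222', 3, 4);

    -- After each insert, fill RoadID with the closest TableB row.
    -- Squared distance preserves the ordering, so no square root is needed.
    CREATE TRIGGER TableA_SetNearestB AFTER INSERT ON TableA
    BEGIN
        UPDATE TableA
        SET RoadID = (
            SELECT B.AssetID
            FROM TableB B
            ORDER BY (B.x - NEW.x) * (B.x - NEW.x)
                   + (B.y - NEW.y) * (B.y - NEW.y)
            LIMIT 1
        )
        WHERE OBJECTID = NEW.OBJECTID;
    END;
    """
)

# Insert a row without a RoadID; the trigger fills it in.
conn.execute("INSERT INTO TableA (OBJECTID, x, y) VALUES (12345, 0, 0)")
road_id = conn.execute(
    "SELECT RoadID FROM TableA WHERE OBJECTID = 12345"
).fetchone()[0]
print(road_id)
```

With these made-up coordinates, RD12222 at (3, 4) is the closest asset to the inserted point at the origin, so that is the value the trigger writes.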

SELECT VALUES in Teradata

I know that it's possible in other SQL flavors (T-SQL) to "select" provided data without a table. Like:
SELECT *
FROM (VALUES (1,2), (3,4)) tbl
How can I do this using Teradata?
Teradata has strange syntax for this:
select t.*
from (select * from (select 1 as a, 2 as b) x
union all
select * from (select 3 as a, 4 as b) x
) t;
I don't have access to a TD system to test, but you might be able to remove one of the nested SELECTs from the answer above:
select x.*
from (
select 1 as a, 2 as b
union all
select 3 as a, 4 as b
) x
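Since neither answer above was tested on Teradata, here is a small check of the general idea in Python with SQLite, which happens to accept both forms. It only shows that a UNION ALL derived table returns the same rows as a VALUES list; it says nothing about what Teradata itself will parse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table-less VALUES form from the question.
values_rows = conn.execute(
    "SELECT * FROM (VALUES (1, 2), (3, 4)) AS tbl"
).fetchall()

# The UNION ALL derived table suggested as a replacement.
union_rows = conn.execute(
    """
    SELECT x.*
    FROM (
        SELECT 1 AS a, 2 AS b
        UNION ALL
        SELECT 3 AS a, 4 AS b
    ) AS x
    """
).fetchall()

print(sorted(values_rows) == sorted(union_rows))
```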
If you need to generate some random rows, you can always do a SELECT from a system table, like sys_calendar.calendar:
SELECT 1, 2
FROM sys_calendar.calendar
SAMPLE 10;
Updated example:
SELECT TOP 1000 -- Limit to 1000 rows (you can use SAMPLE too)
ROW_NUMBER() OVER() MyNum, -- Sequential numbering
MyNum MOD 7, -- Modulo operator
RANDOM(1,1000), -- Random number between 1,1000
HASHROW(MyNum) -- Rowhash value of given column(s)
FROM sys_calendar.calendar; -- Use as table to source rows
A couple notes:
make sure you pick a system table that will always be present and have rows
if you need more rows than are available in the source table, do a UNION to get more rows
you can always easily create a one-column table and populate it to whatever number of rows you want by INSERT/SELECT into it:
CREATE TABLE DummyTable (c1 INT); -- Create table
INSERT INTO DummyTable(1); -- Seed table
INSERT INTO DummyTable SELECT * FROM DummyTable; -- Run this to duplicate rows as many times as you want
Then use this table to create whatever resultset you want, similar to the query above with sys_calendar.calendar.
I don't have a TD system to test so you might get syntax errors...but that should give you a basic idea.
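The self-insert trick doubles the table on every run, so it grows quickly. A minimal sketch in Python with SQLite (not Teradata, so the INSERT uses the standard VALUES keyword) shows the growth:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DummyTable (c1 INTEGER)")
conn.execute("INSERT INTO DummyTable VALUES (1)")  # seed row


def row_count():
    """Return the current number of rows in DummyTable."""
    return conn.execute("SELECT COUNT(*) FROM DummyTable").fetchone()[0]


sizes = [row_count()]
for _ in range(10):
    # Each run appends a copy of every existing row, doubling the table.
    conn.execute("INSERT INTO DummyTable SELECT * FROM DummyTable")
    sizes.append(row_count())

print(sizes)
```

Ten runs take the single seed row to 1024 rows, so a few dozen runs are more than most test data sets will ever need.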
I am a bit late to this thread, but recently got the same error.
I solved this by simply using
select distinct 1 as a, 2 as b from DBC.tables
union all
select distinct 3 as a, 4 as b from DBC.tables
Here, DBC.tables is a DB backend table with only a few rows, so the query runs fast as well.

Insert values in table only one column changes value

I've got a table with 2 columns,
GROUP PROJECTS
10001 1
10001 2
First column (GROUP) stays the same value 10001.
Second column (PROJECTS) changes values 3,5,9,100, etc. (I have 400 project ID's)
What would be the correct (loop?) statement to insert all 400 PROJECTS?
I used insert, values for smaller lists.
INSERT INTO table (GROUP_ID, PROJECTS) VALUES (10001, 1, 10001, 2, 10001, etc, 10001, etc);
I have the list in Excel (if needed I can create a Temp table with all 400 project ID's)
Thanks.
I typically write such inserts as:
INSERT INTO table(GROUP_ID, PROJECTS)
select 10001, 1 from dual union all
select 10001, 2 from dual union all
. . . ;
You should be able to generate the select statement pretty easily in Excel.
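If Excel formulas are awkward, the same statement text can be generated with a few lines of script. This is a rough sketch in Python; build_insert, the table name group_projects, and the short ID list are all made up for the example, and the int() calls are there so that only numbers end up in the generated SQL.

```python
def build_insert(table, group_id, project_ids):
    """Return one INSERT ... SELECT ... UNION ALL statement as text."""
    selects = [
        f"select {int(group_id)}, {int(pid)} from dual" for pid in project_ids
    ]
    return (
        f"INSERT INTO {table} (GROUP_ID, PROJECTS)\n"
        + " union all\n".join(selects)
        + ";"
    )


# Hypothetical input: in practice the IDs would come from the Excel list.
sql = build_insert("group_projects", 10001, [1, 2, 3, 5, 9, 100])
print(sql)
```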
If the project IDs exist in their own table (or you can create one from your Excel data), then you can get the list of values from there and cross-join those with all the group IDs:
insert into group_projects (group_id, project_id)
select g.group_id, p.project_id
from groups g
cross join projects p
where not exists (
select 1 from group_projects gp
where gp.group_id = g.group_id and gp.project_id = p.project_id
);
The where not exists() excludes all the existing pairs so you don't insert duplicates.
If the groups don't have their own table then you can use the existing values from a subquery:
insert into group_projects (group_id, project_id)
select g.group_id, p.project_id
from (select distinct group_id from group_projects) g
cross join projects p
where not exists (
select 1 from group_projects gp
where gp.group_id = g.group_id and gp.project_id = p.project_id
);
You could use Gordon's approach to generate the project ID list as a subquery as well, if you didn't want to create a table for those.
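One useful property of the NOT EXISTS version is that it can be run repeatedly without creating duplicates. A minimal sketch in Python with SQLite, using a handful of invented project IDs, shows the second run adding nothing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (project_id INTEGER PRIMARY KEY);
    CREATE TABLE group_projects (group_id INTEGER, project_id INTEGER);
    INSERT INTO projects VALUES (1), (2), (3), (5), (9);
    INSERT INTO group_projects VALUES (10001, 1), (10001, 2);
    """
)

# Insert every (group, project) pair that is not already present.
FILL_MISSING = """
    INSERT INTO group_projects (group_id, project_id)
    SELECT g.group_id, p.project_id
    FROM (SELECT DISTINCT group_id FROM group_projects) g
    CROSS JOIN projects p
    WHERE NOT EXISTS (
        SELECT 1 FROM group_projects gp
        WHERE gp.group_id = g.group_id AND gp.project_id = p.project_id
    )
"""


def pair_count():
    """Return the current number of rows in group_projects."""
    return conn.execute("SELECT COUNT(*) FROM group_projects").fetchone()[0]


before = pair_count()
conn.execute(FILL_MISSING)
after_first = pair_count()   # the three missing pairs were added
conn.execute(FILL_MISSING)
after_second = pair_count()  # unchanged: nothing was missing any more
print(before, after_first, after_second)
```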
I'd go with what I view as a simpler and much more readable solution... create the temp table with the data from Excel, then run this-
DECLARE
CURSOR c1
IS
SELECT project_id
FROM temp_table
WHERE project_id IS NOT NULL;
BEGIN
FOR rec in c1
LOOP
INSERT INTO table
VALUES (10001, rec.project_id);
COMMIT;
END LOOP;
END;
Seems cleaner than one giant insert statement or something complex with joins and sub-queries. If you want to make sure the value doesn't already exist in "table", add that criterion to the cursor select statement, or if you have constraints on the table, add an exception handler in the loop.

Insert into Teradata

insert into tablex (a,b,c)
select distinct a,b,c
from tableA;
When I run the select distinct statement alone, it returns 6 rows.
When I run it with the insert, it reports 0 rows inserted.
Is it a bug, or am I missing something?
#Teradata 13.10
masked original Query
INSERT INTO tablex
(SYSTEM_ID,START_DATE,END_DATE,CURRENT_FLAG )
SELECT DISTINCT
s.SYSTEM_ID
,s.trans_DATE
,DATE '9999-12-31'
,'X'
FROM s JOIN cc
ON s.var_id=cc.var_id
WHERE s.sno = cc.sno
AND s.sno<>s.orino AND s.orino IS NOT NULL AND s.orino <> ''
AND cc.end_date=s.trans_date-1;
It's not a bug :-)
All six rows existed already in the target table and it's a SET table which automatically removes duplicate rows during an Insert/Select.
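The behaviour can be imitated outside Teradata to see why "0 rows inserted" is not an error. The sketch below is only an analogy in Python with SQLite: SQLite has no SET tables, so a UNIQUE constraint over every column plus INSERT OR IGNORE stands in for the silent duplicate removal, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tableA (a INTEGER, b INTEGER, c INTEGER);
    -- A UNIQUE constraint over every column stands in for a SET table.
    CREATE TABLE tablex (a INTEGER, b INTEGER, c INTEGER, UNIQUE (a, b, c));
    INSERT INTO tableA VALUES (1, 1, 1), (1, 1, 1), (2, 2, 2);
    INSERT INTO tablex VALUES (1, 1, 1), (2, 2, 2);
    """
)

before = conn.execute("SELECT COUNT(*) FROM tablex").fetchone()[0]
# INSERT OR IGNORE silently skips rows that would duplicate an existing one.
conn.execute(
    "INSERT OR IGNORE INTO tablex (a, b, c) SELECT DISTINCT a, b, c FROM tableA"
)
after = conn.execute("SELECT COUNT(*) FROM tablex").fetchone()[0]
inserted = after - before
print(inserted)
```

The select on its own returns two distinct rows here, yet the insert adds none, because both are already in the target table.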