SQL-TABLE CREATION - sql

Please explain the use of the comma after 'FROM TABLE_ABC A'. How does it work in the execution of the SQL query?
CREATE TABLE ABCD AS
( SELECT A.*
FROM TABLE_ABC A,
(SELECT COL_1,COL_2 FROM
(SELECT B.*,C.* FROM
TABLE_XYZ B, TABLE_MNO C
WHERE B.COL_X=C.COL_Y
)D
)A.COL_C=D.COL_D
)
WITH DATA PRIMARY INDEX(SASAJS)

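It is similar to a join. The comma is the older implicit join syntax: it forms the cross product of the two tables, and the WHERE clause then filters it.
select * from #tempA ta join #tempB tb
on ta.ID = tb.ID
is the same as
select * from #tempA ta, #tempB tb
where ta.ID = tb.ID
Using explicit JOINs makes the code easier to read.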
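To see the equivalence for yourself, here is a minimal sketch using Python's built-in sqlite3 module rather than your actual database. The tables temp_a and temp_b and their contents are made up for illustration.

```python
import sqlite3

# In-memory database with two small hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_a (id INTEGER, label TEXT);
    CREATE TABLE temp_b (id INTEGER, note TEXT);
    INSERT INTO temp_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO temp_b VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

# Explicit JOIN ... ON syntax.
explicit_join = conn.execute("""
    SELECT ta.id, ta.label, tb.note
    FROM temp_a ta JOIN temp_b tb ON ta.id = tb.id
    ORDER BY ta.id
""").fetchall()

# Comma syntax: cross product of both tables, filtered by WHERE.
comma_join = conn.execute("""
    SELECT ta.id, ta.label, tb.note
    FROM temp_a ta, temp_b tb
    WHERE ta.id = tb.id
    ORDER BY ta.id
""").fetchall()

# Without a WHERE clause the comma yields every combination of rows.
cross_count = conn.execute("SELECT COUNT(*) FROM temp_a, temp_b").fetchone()[0]

print(explicit_join == comma_join)
print(comma_join)
print(cross_count)
```

The last query shows why a forgotten WHERE condition is dangerous with the comma syntax: nothing stops the full cross product from being returned.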

You're using a select to create a 2nd table that is also created from another subselect. See it like this and you'll understand it better:
CREATE TABLE ABCD AS(
SELECT
A.*
FROM
TABLE_ABC A,
(
SELECT
COL_1,
COL_2
FROM
(
SELECT
B.*,
C.*
FROM
TABLE_XYZ B,
TABLE_MNO C
WHERE
B.COL_X = C.COL_Y
) D
)
WHERE
A.COL_C = D.COL_D
) WITH DATA PRIMARY INDEX(SASAJS)
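But your original code is missing the WHERE keyword before A.COL_C = D.COL_D, so I added it.
I'm assuming that B.* or C.* includes a column named COL_D and that A has a column named COL_C. It would also be better to move the D alias to after the ) just before the last WHERE, so that it names the outer derived table that the WHERE clause refers to.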


How can you figure out if Column A contains something from Column B?

I've been trying to figure out a way to grab information from Table A Column A compared to Table B Column A, for example:
TableA
Name
abcd_1234_efgh
zxcdde_gets_3214_
jkil_uelso_5555_aseil
uuuu_kkkk_iiii_3333
TableB
ID
1234
3214
5555
3333
I've tried doing an INNER JOIN from Table A to Table B then doing a WHERE TableA.A LIKE TableB.B, but I think I'm missing a section to make it work.
SELECT
a.Name,
b.ID
FROM
TableA a
INNER JOIN
TableB b
ON
a.Name LIKE CAST(b.ID AS STRING)
The result I want from it is:
Name ID
abcd_1234_efgh 1234
zxcdde_gets_3214_ 3214
jkil_uelso_5555_aseil 5555
uuuu_kkkk_iiii_3333 3333
But currently I'm getting nothing as a result. I believe I'm missing something or might be thinking of the wrong way to go about getting the result needed. Any help would be greatly appreciated!
-Maykid
You are close. I think this will work in BigQuery:
SELECT a.Name, b.ID
FROM TableA a INNER JOIN
TableB b
ON a.Name LIKE CONCAT('%', CAST(b.ID AS STRING), '%');
But you may really want:
SELECT a.Name, b.ID
FROM TableA a CROSS JOIN
UNNEST(SPLIT(a.Name, '_')) namepart JOIN
TableB b
ON namepart = CAST(b.ID AS STRING);
This looks at each part of the name separately and allows BigQuery to do an equality join, which should be more scalable.
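If you don't have BigQuery at hand, the two ideas can be tried locally. This is only a rough analog using Python's sqlite3 module: SQLite has no SPLIT or UNNEST, so the token matching is done in Python here, and `||` stands in for CONCAT. The sample data is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Name TEXT);
    CREATE TABLE TableB (ID INTEGER);
    INSERT INTO TableA VALUES ('abcd_1234_efgh'), ('zxcdde_gets_3214_'),
                              ('jkil_uelso_5555_aseil'), ('uuuu_kkkk_iiii_3333');
    INSERT INTO TableB VALUES (1234), (3214), (5555), (3333);
""")

# Variant 1: substring match, wrapping the ID in % wildcards.
like_rows = conn.execute("""
    SELECT a.Name, b.ID
    FROM TableA a
    JOIN TableB b ON a.Name LIKE '%' || CAST(b.ID AS TEXT) || '%'
    ORDER BY a.rowid
""").fetchall()

# Variant 2: split each name on '_' and compare whole tokens for equality.
ids = {str(i) for (i,) in conn.execute("SELECT ID FROM TableB")}
names = [n for (n,) in conn.execute("SELECT Name FROM TableA ORDER BY rowid")]
token_rows = [
    (name, int(part))
    for name in names
    for part in name.split("_")
    if part in ids
]

print(like_rows)
print(token_rows == like_rows)
```

The two variants agree on this data, but they are not the same in general: the substring match would also pair a name containing 12345 with ID 1234, while the token comparison would not.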
The query below is for BigQuery Standard SQL:
#standardSQL
SELECT *
FROM `project.dataset.tableA`
CROSS JOIN `project.dataset.tableB`
WHERE REGEXP_CONTAINS(Name, id)
You can test and play with the query above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.tableA` AS (
SELECT 'abcd_1234_efgh' Name UNION ALL
SELECT 'zxcdde_gets_3214_' UNION ALL
SELECT 'jkil_uelso_5555_aseil' UNION ALL
SELECT 'uuuu_kkkk_iiii_3333'
), `project.dataset.tableB` AS (
SELECT '1234' id UNION ALL
SELECT '3214' UNION ALL
SELECT '5555' UNION ALL
SELECT '3333'
)
SELECT *
FROM `project.dataset.tableA`
CROSS JOIN `project.dataset.tableB`
WHERE REGEXP_CONTAINS(Name, id)
with result
Row Name id
1 abcd_1234_efgh 1234
2 zxcdde_gets_3214_ 3214
3 jkil_uelso_5555_aseil 5555
4 uuuu_kkkk_iiii_3333 3333
Note: REGEXP_CONTAINS gives you the full power of regular expressions, but it is a little expensive, so you can use STRPOS() instead, as in the example below:
#standardSQL
SELECT *
FROM `project.dataset.tableA`
CROSS JOIN `project.dataset.tableB`
WHERE STRPOS(Name, id) > 0
Quick update:
I just realised that id is not a STRING but rather an INT in your question, so:
REGEXP_CONTAINS(Name, id) should be replaced with REGEXP_CONTAINS(Name, CAST(id AS STRING))
and the same goes for STRPOS(Name, id).
Given your data structure, maybe something like this helps (note the application of SAFE_CAST):
select name, c, t2.number from (
select t1.name, split(t1.name, "_") splitted from TableA t1
), unnest(splitted) c
left join TableB t2 on t2.number = SAFE_CAST(c as int64)
where number is not null

SQL joined by last date

This question has been asked here more than once, but I couldn't find what I was looking for. I want to join two tables so that each row of the first table is joined to the most recent matching record of the second, ordered by date and time. Up to this point everything works.
My trouble starts when there are more than two records in the joined table. Let me show you a sample:
table_a
-------
id
name
description
created
updated
table_b
-------
id
table_a_id
name
description
created
updated
What I have done at the beginning was:
SELECT a.id, b.updated
FROM table_a AS a
LEFT JOIN (SELECT table_a_id, max (updated) as updated
FROM table_b GROUP BY table_a_id ) AS b
ON a.id = b.table_a_id
Up to this point I was getting the columns a.id and b.updated. I need all of the table_b columns, but when I try to add another column to my query, Postgres tells me that I need to add it to the GROUP BY clause in order to complete the query, and then the result is not what I am looking for.
I am trying to find a way to have this list.
DISTINCT ON is your friend. Here is a solution with correct syntax:
SELECT a.id, b.updated, b.col1, b.col2
FROM table_a as a
LEFT JOIN (
SELECT DISTINCT ON (table_a_id)
table_a_id, updated, col1, col2
FROM table_b
ORDER BY table_a_id, updated DESC
) b ON a.id = b.table_a_id;
Or, to get the whole row from table_b:
SELECT a.id, b.*
FROM table_a as a
LEFT JOIN (
SELECT DISTINCT ON (table_a_id)
*
FROM table_b
ORDER BY table_a_id, updated DESC
) b ON a.id = b.table_a_id;
Detailed explanation for this technique as well as alternative solutions under this closely related question:
Select first row in each GROUP BY group?
Try:
SELECT a.id, b.*
FROM table_a AS a
LEFT JOIN (SELECT t.*,
row_number() over (partition by table_a_id
order by updated desc) rn
FROM table_b t) AS b
ON a.id = b.table_a_id and b.rn=1
You can use Postgres's distinct on syntax:
select a.id, b.*
from table_a as a left join
(select distinct on (table_a_id) table_a_id, . . .
from table_b
order by table_a_id, updated desc
) b
on a.id = b.table_a_id
Where the . . . is, you should put in the columns that you want.
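For anyone who wants to try the idea without a Postgres instance, here is a small sketch of the row_number() variant using Python's sqlite3 module. DISTINCT ON is Postgres-specific, so it is not shown. The sample rows are invented, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, table_a_id INTEGER,
                          name TEXT, updated TEXT);
    INSERT INTO table_a VALUES (1, 'first'), (2, 'second'), (3, 'no children');
    INSERT INTO table_b VALUES
        (10, 1, 'old',    '2020-01-01 10:00:00'),
        (11, 1, 'newest', '2020-03-01 10:00:00'),
        (12, 1, 'middle', '2020-02-01 10:00:00'),
        (13, 2, 'only',   '2020-01-15 10:00:00');
""")

# Number the table_b rows per table_a_id, newest first, and keep only rn = 1.
rows = conn.execute("""
    SELECT a.id, b.id, b.name, b.updated
    FROM table_a AS a
    LEFT JOIN (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY table_a_id
                                  ORDER BY updated DESC) AS rn
        FROM table_b t
    ) AS b ON a.id = b.table_a_id AND b.rn = 1
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Because it is a LEFT JOIN, a table_a row with no table_b records still appears, with NULLs (None in Python) in the table_b columns.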

Need to convert DB2 query to TSQL

I'm trying to test the first answer to this question:
SQL - message schema - need to find an existing message thread given a set of users
The first answer to this question is written in DB2 and I'm having a hard time converting the answer to TSQL. Can someone help me figure this out? Here's the query:
WITH Selected_Users(id) as (VALUES (#id1), (#id2), --etc--),
Threads(id) as (SELECT DISTINCT threadFk
FROM ThreadMembers as a
JOIN Selected_Users as b
ON b.id = a.userFk)
SELECT a.id
FROM Threads as a
WHERE NOT EXISTS (SELECT '1'
FROM ThreadMembers as b
LEFT JOIN Selected_Users as c
ON c.id = b.userFk
WHERE c.id IS NULL
AND b.threadFk = a.id)
AND NOT EXISTS (SELECT '1'
FROM Selected_Users as b
LEFT JOIN ThreadMembers as c
ON c.userFk = b.id
AND c.threadFk = a.id
WHERE c.userFk IS NULL)
The description of the query is part of the answer, which helps a lot. The first part of the query creates a temp table called Selected_Users, but I'm not sure how this would be done. Thanks in advance!
I don't think T-SQL allows for the list syntax that DB2 does.
As Andriy M points out, SQL 2008+ does allow a pretty similar syntax:
WITH Selected_Users(id) AS (
SELECT Id FROM (
VALUES (@id1), (@id2), --etc--
) AS V(Id)
),
....
Or you could create a real temp table (or variable):
DECLARE @selected_Users TABLE (id int);
INSERT @selected_Users VALUES
(@id1),
(@id2),
--etc.--
; --make sure to close with semi-colon before WITH CTE
and then replace Selected_Users with @selected_Users in the rest of the query. Or change the initial CTE to:
WITH Selected_Users(id) AS (
SELECT * FROM @selected_Users
),
....
Or, you could do a UNION ALL:
WITH Selected_Users(id) AS (
SELECT @id1
UNION ALL SELECT @id2
UNION ALL SELECT @id3
--etc.--
),
....
I'm unfamiliar with DB2, but if the Selected_Users and Threads "temp tables" are supposed to be CTEs (common table expressions -- basically inline-views), then you'll have to change those to:
WITH Selected_Users(id) AS
(
SELECT @id1 UNION
SELECT @id2
),
Threads(id) AS
(
SELECT DISTINCT
threadFk
FROM
ThreadMembers a
JOIN
Selected_Users b
ON
a.userFk = b.id
)
SELECT
a.Id
FROM
Threads a
WHERE
...
I'll think about the rest and update soon.
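In the meantime, the logic of the whole query can be checked on a toy data set. The sketch below runs it in SQLite through Python's sqlite3 module, not in T-SQL. SQLite happens to accept a bare VALUES list inside a CTE, so the DB2 form carries over almost unchanged. The find_threads helper and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ThreadMembers (threadFk INTEGER, userFk INTEGER);
    INSERT INTO ThreadMembers VALUES
        (100, 1), (100, 2),           -- thread 100: users 1 and 2
        (200, 1), (200, 2), (200, 3), -- thread 200: users 1, 2 and 3
        (300, 1);                     -- thread 300: user 1 only
""")

def find_threads(conn, user_ids):
    """Return ids of threads whose members are exactly user_ids (non-empty)."""
    placeholders = ", ".join("(?)" for _ in user_ids)
    sql = f"""
        WITH Selected_Users(id) AS (VALUES {placeholders}),
        Threads(id) AS (
            SELECT DISTINCT threadFk
            FROM ThreadMembers AS a
            JOIN Selected_Users AS b ON b.id = a.userFk
        )
        SELECT a.id
        FROM Threads AS a
        -- no member of the thread is outside the selected users
        WHERE NOT EXISTS (SELECT 1
                          FROM ThreadMembers AS b
                          LEFT JOIN Selected_Users AS c ON c.id = b.userFk
                          WHERE c.id IS NULL AND b.threadFk = a.id)
        -- no selected user is missing from the thread
          AND NOT EXISTS (SELECT 1
                          FROM Selected_Users AS b
                          LEFT JOIN ThreadMembers AS c
                                 ON c.userFk = b.id AND c.threadFk = a.id
                          WHERE c.userFk IS NULL)
        ORDER BY a.id
    """
    return [row[0] for row in conn.execute(sql, list(user_ids))]

print(find_threads(conn, [1, 2]))
print(find_threads(conn, [1, 2, 3]))
print(find_threads(conn, [2, 3]))
```

The two NOT EXISTS checks together mean "same set of users": the first rejects threads with extra members, the second rejects threads that lack one of the selected users.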

SQL: how to find unused primary key

I've got a table with > 1'000'000 entries; this table is referenced from about 130 other tables. My problem is that a lot of those million entries are old and unused.
What's the fastest way to find the entries not referenced by any of the other tables? I don't like to do a
select * from (
select * from table-a TA
minus
select * from table-a TA where TA.id in (
select "ID" from (
(select distinct FK-ID "ID" from table-b)
union all
(select distinct FK-ID "ID" from table-c)
...
Is there an easier, more general way?
Thank you all!
You could do this:
select * from table_a a
where not exists (select * from table_b where fk_id = a.id)
and not exists (select * from table_c where fk_id = a.id)
and not exists (select * from table_d where fk_id = a.id)
...
try :
select a.*
from table_a a
left join table_b b on a.id=b.fk_id
left join table_c c on a.id=c.fk_id
left join table_d d on a.id=d.fk_id
left join table_e e on a.id=e.fk_id
......
where b.fk_id is null
and c.fk_id is null
and d.fk_id is null
and e.fk_id is null
.....
you might also try:
select a.*
from table_a a
left join
(select b.fk_id from table_b b union
select c.fk_id from table_c c union
...) table_union on a.id=table_union.fk_id
where table_union.fk_id is null
This is more SQL oriented and it will not take forever like the above solution.
Not sure about efficiency but:
select * from table_a
where id not in (
select id from table_b
union
select id from table_c )
If your concern is allowing the database to continue normal operations while you do the housekeeping, you could split it into multiple stages:
insert into tblIds
select id from table_b
union
select id from table_c
as many times as you need, and then:
delete from table_a where id not in ( select id from tblIds )
Of course sometimes doing a lot of processing takes a lot of time.
I like @Patrick's answer above, but I would like to add to that.
Rather than building the 130-step query by hand, you could build these INSERT statements by scanning sysObjects, finding key relations and generating your INSERT statements.
That would not only save you time, but should also help you to know for sure whether you've covered all the tables - maybe there are 131, or only 129.
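As a rough illustration of generating the statements from the catalog, here is a sketch in Python against SQLite, where `sqlite_master` and `PRAGMA foreign_key_list` play the role that sysObjects or the Oracle data dictionary would play in your system. The table names, the assumption that the parent key column is called id, and both helper functions are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY,
                          fk_id INTEGER REFERENCES table_a(id));
    CREATE TABLE table_c (id INTEGER PRIMARY KEY,
                          a_ref INTEGER REFERENCES table_a(id));
    INSERT INTO table_a VALUES (1, 'used by b'), (2, 'used by c'),
                               (3, 'unused'), (4, 'unused too');
    INSERT INTO table_b VALUES (1, 1);
    INSERT INTO table_c VALUES (1, 2);
""")

def referencing_columns(conn, parent):
    """Return (child_table, child_column) pairs whose foreign key targets parent."""
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    pairs = []
    for table in tables:
        # Row layout: id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            if fk[2] == parent:
                pairs.append((table, fk[3]))
    return pairs

def unused_rows_query(parent, pairs):
    """Build one NOT EXISTS clause per referencing column (assumes key column id)."""
    clauses = [
        f'NOT EXISTS (SELECT 1 FROM "{t}" WHERE "{t}"."{c}" = p.id)'
        for t, c in pairs
    ]
    where = " AND ".join(clauses) or "1"
    return f'SELECT p.id FROM "{parent}" p WHERE {where} ORDER BY p.id'

pairs = referencing_columns(conn, "table_a")
sql = unused_rows_query("table_a", pairs)
unused = [row[0] for row in conn.execute(sql)]

print(pairs)
print(unused)
```

The length of the generated pairs list also answers the "130, 131 or 129 tables" question directly.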
I'm inclined to Marcelo Cantos' answer (and have upvoted it), but here is an alternative in an attempt to circumvent the problem of not having indexes on the foreign keys...
WITH
ids_a AS
(
SELECT id FROM myTable
)
,
ids_b AS
(
SELECT id FROM ids_a WHERE NOT EXISTS (SELECT * FROM table_a WHERE fk_id = ids_a.id)
)
,
ids_c AS
(
SELECT id FROM ids_b WHERE NOT EXISTS (SELECT * FROM table_b WHERE fk_id = ids_b.id)
)
,
...
,
ids_z AS
(
SELECT id FROM ids_y WHERE NOT EXISTS (SELECT * FROM table_y WHERE fk_id = ids_y.id)
)
SELECT * FROM ids_z
All I'm trying to do is suggest an order to Oracle to minimise its effort. Unfortunately, Oracle will compile this to something very similar to Marcelo Cantos' answer, and it may not perform any differently.

Syntax Error in SQL

SELECT *
INTO Temp3
from
( SELECT B.Name
FROM [Temp2] as B
WHERE B.Name
Not IN (
SELECT E.WorkerName
FROM WorkerDetail as E ) )
Why does this produce an error?
If you want to use a derived table you need to alias it:
SELECT T1.*
INTO Temp3
from
( SELECT B.Name
FROM [Temp2] as B
WHERE B.Name
Not IN (
SELECT E.WorkerName
FROM WorkerDetail as E ) ) AS T1
I'm not sure if you actually need to use a derived table, however.
This should also work:
SELECT B.Name
INTO Temp3
FROM [Temp2] as B
WHERE B.Name
Not IN (
SELECT E.WorkerName
FROM WorkerDetail as E )
Maybe Temp3 already exists?
In MS SQL Server, SELECT ... INTO creates a new table and populates it with data.
If the table already exists, use an INSERT INTO ... SELECT ... FROM statement instead.