How to convert Oracle SQL query into datalake(Impala)?

I have an Oracle SQL query in the format below. How can I convert it to Impala?
SELECT
Some_Columns
FROM
(
SELECT * FROM
(
SELECT
Some_more_columns
FROM
Main_table
LEFT JOIN t1 on cond_1 = cond_2
....
....
Some more left joins
.....
WHERE
some_filters
)
PIVOT(
count(*) for column_x in (some_column)
)
) T
ORDER BY 2,3
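
As far as I know, Impala has no PIVOT clause, so the usual workaround is conditional aggregation: one COUNT(CASE ...) per pivoted value, grouped by the remaining columns. The sketch below is only an illustration under stated assumptions: the table `joined_rows`, the column `region`, and the pivot values 'A' and 'B' are hypothetical stand-ins for the inner query and `some_column` in the question, and SQLite is used purely so the example runs locally. The SELECT itself is plain standard SQL.

```python
import sqlite3

# Hypothetical stand-in for the filtered, joined inner query in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE joined_rows (region TEXT, column_x TEXT);
    INSERT INTO joined_rows VALUES
        ('north', 'A'), ('north', 'A'), ('north', 'B'),
        ('south', 'B');
""")

# PIVOT(count(*) FOR column_x IN ('A', 'B')) rewritten as conditional
# aggregation: COUNT ignores NULLs, so each CASE counts only matching rows.
PIVOT_QUERY = """
    SELECT region,
           COUNT(CASE WHEN column_x = 'A' THEN 1 END) AS a_count,
           COUNT(CASE WHEN column_x = 'B' THEN 1 END) AS b_count
    FROM joined_rows
    GROUP BY region
    ORDER BY 1, 2
"""

pivot_rows = conn.execute(PIVOT_QUERY).fetchall()
print(pivot_rows)  # [('north', 2, 1), ('south', 0, 1)]
```

The pivoted values have to be listed explicitly, just as in the IN (...) list of the Oracle PIVOT clause.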

Related

WITH Clause syntax error getting in Teradata

I am attempting to join multiple tables/views inside a WITH clause, but it is failing. Please help me.
with a as (
select
*
from
db.a
),
b as (
select
*
from
db.b
),
c as (
select
*
from
b
where
col = 'Hi'
),
d as (
SELECT
*
FROM
c P
WHERE
col = 'Y'
),
e as (
SELECT
*
FROM
b P
LEFT JOIN (
SELECT
*
FROM
a
WHERE
col= 'Y'
) M ON p.col1= M.col1
WHERE
P.Date > current_date - 5
)
select
*
from
e
I am getting an error. Is anything wrong with it?
Teradata with clause help

Get name from two ids by joining tables (SQL)

First table:
MEMO_ID1 | MEMO_ID2 | UPDATED_BY
1        | 2        | Bob

Second table:
MEMO_ID1 | MEMO_NAME
1        | UD
2        | LD

Result table I want:
MEMO_ID1 | MEMO_ID2 | UPDATED_BY
UD       | LD       | Bob

SELECT u.MEMO_ID1, u.MEMO_ID2, u.UPDATED_BY
FROM USER u;
How can I join the user and memo tables to get the names of two different IDs?

Try the query below, which joins the memo table once for each ID column:
select m1.MEMO_NAME as MEMO_ID1, m2.MEMO_NAME as MEMO_ID2, t1.UPDATED_BY
from table1 t1
join table2 m1 on m1.MEMO_ID1=t1.MEMO_ID1
join table2 m2 on m2.MEMO_ID1=t1.MEMO_ID2
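
One way to get both names is to join the memo table twice, once per ID column. Here is a minimal sketch using the sample rows from the question. The table names `table1` and `table2` follow the answer above, the aliases `m1` and `m2` are my own, and SQLite is used only so that the query can be run as-is. LEFT JOIN is an assumption on my part, chosen so that a row with an unknown memo ID is kept with a NULL name instead of being dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (MEMO_ID1 INTEGER, MEMO_ID2 INTEGER, UPDATED_BY TEXT);
    CREATE TABLE table2 (MEMO_ID1 INTEGER, MEMO_NAME TEXT);
    INSERT INTO table1 VALUES (1, 2, 'Bob');
    INSERT INTO table2 VALUES (1, 'UD'), (2, 'LD');
""")

# Each alias of table2 resolves one of the two ID columns to its name.
MEMO_QUERY = """
    SELECT m1.MEMO_NAME AS MEMO_ID1,
           m2.MEMO_NAME AS MEMO_ID2,
           t1.UPDATED_BY
    FROM table1 t1
    LEFT JOIN table2 m1 ON m1.MEMO_ID1 = t1.MEMO_ID1
    LEFT JOIN table2 m2 ON m2.MEMO_ID1 = t1.MEMO_ID2
"""

memo_rows = conn.execute(MEMO_QUERY).fetchall()
print(memo_rows)  # [('UD', 'LD', 'Bob')]
```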

You didn't mention which database engine you're using, so I will do everything in T-SQL. Other DB engines have their own functions for pivoting data.
In order to join data from table1 with table2 you need to unpivot table1.
There are two ways to unpivot this data.
One option is to use UNPIVOT:
SELECT updated_by, pvt_id
FROM (
SELECT memo_id1, memo_id2, updated_by
FROM t1
) pvt
UNPIVOT (pvt_id FOR col_names IN (memo_id1, memo_id2)) AS unpvt
The other is to UNION the data like this:
SELECT memo_id1, updated_by
FROM t1
UNION
SELECT memo_id2, updated_by
FROM t1
Now you can join this data with table2 and pivot the result back:
WITH source AS
(
SELECT updated_by, pvt_id
FROM
(
SELECT memo_id1
,memo_id2
,updated_by
FROM t1
) pvt UNPIVOT (pvt_id FOR col_names IN (memo_id1, memo_id2)) AS unpvt
),
r1 AS
(
SELECT *
FROM source s
LEFT JOIN t2
ON t2.memo_id = s.pvt_id
)
SELECT updated_by, [1] AS [memo1], [2] AS [memo2]
FROM
(
SELECT updated_by, memo_name, memo_id
FROM r1
) pvt
PIVOT (MIN(memo_name) for memo_id IN ([1] , [2])) AS pvt2;
Or the same with UNION and PIVOT:
WITH source (m_id, updated_by) AS
(
SELECT memo_id1, updated_by
FROM t1 union
SELECT memo_id2, updated_by
FROM t1
),
r1 AS
(
SELECT *
FROM source
LEFT JOIN t2
ON t2.memo_id = source.m_id
)
SELECT updated_by,[1] AS [memo1],[2] AS [memo2]
FROM
(
SELECT updated_by, memo_name, memo_id
FROM r1
) pvt
PIVOT (MIN(memo_name) for memo_id IN ([1], [2])) AS pvt2;
An even simpler solution, if you only need to handle the data in your example and don't need extensibility (note that MIN and MAX order the names alphabetically, not by ID column):
with source (m_id, updated_by) as (
select memo_id1, updated_by
from t1
union
select memo_id2, updated_by
from t1
)
select s.updated_by, min(t2.memo_name) [memo1], max(t2.memo_name) [memo2]
from source s
LEFT JOIN t2 on t2.memo_id = s.m_id
group BY s.updated_by
;

Get Distinct From Multiple Similar Tables

I have this query to join a couple tables and get distinct values, it looks something like this:
SELECT DISTINCT [TrackingCode]
,[Opponent]
,CONCAT([TrackingCode], ' | ', [Opponent]) AS RowName
,[MultiYrEvent]
,[Identifier]
FROM [BUDGET_FY2014].[dbo].[TrackingCodes]
INNER JOIN
(
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions]
WHERE Report='2377010003'
) AS T
ON T.EventCode LIKE CAST(TrackingCodes.TrackingCode AS nvarchar(20))+'%'
ORDER BY TrackingCode ASC
It works fine. However, I've got multiple Transactions tables with the same schema for the first and second previous years relative to the Transactions table, and I'd like to see distinct values from all three tables. So for example, if I copy/paste this query and change [Transactions] to [Transactions_Yr1] or [Transactions_Yr2], then I get the data I want from those tables. But, I want to combine the three. If I try to join them all, I get no results returned. I sort of understand why this doesn't work, but I don't know where to go from here:
SELECT DISTINCT [TrackingCode]
,[Opponent]
,CONCAT([TrackingCode], ' | ', [Opponent]) AS RowName
,[MultiYrEvent]
,[Identifier]
FROM [BUDGET_FY2014].[dbo].[TrackingCodes]
INNER JOIN
(
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions]
WHERE Report='2377010003'
) AS T
ON T.EventCode LIKE CAST(TrackingCodes.TrackingCode AS nvarchar(20))+'%'
INNER JOIN
(
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions_Yr1]
WHERE Report='2377010003'
) AS T1
ON T1.EventCode LIKE CAST(TrackingCodes.TrackingCode AS nvarchar(20))+'%'
INNER JOIN
(
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions_Yr2]
WHERE Report='2377010003'
) AS T2
ON T2.EventCode LIKE CAST(TrackingCodes.TrackingCode AS nvarchar(20))+'%'
ORDER BY TrackingCode ASC
Any advice would be appreciated!

Try using the UNION ALL clause, e.g.:
SELECT DISTINCT [FIELDS]
FROM (
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions]
WHERE Report='2377010003'
UNION ALL
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions_Yr1]
WHERE Report='2377010003'
UNION ALL
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions_Yr2]
WHERE Report='2377010003'
) AS T
ORDER BY TrackingCode ASC

Have you tried unioning your transaction tables together?
Reference: https://msdn.microsoft.com/en-us/library/ms180026.aspx
SELECT DISTINCT [TrackingCode]
,[Opponent]
,CONCAT([TrackingCode], ' | ', [Opponent]) AS RowName
,[MultiYrEvent]
,[Identifier]
FROM [BUDGET_FY2014].[dbo].[TrackingCodes]
INNER JOIN
(
-- Combine the three tables first, then join the combined set once
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions]
WHERE Report='2377010003'
UNION
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions_Yr1]
WHERE Report='2377010003'
UNION
SELECT *
FROM [BUDGET_FY2014].[dbo].[Transactions_Yr2]
WHERE Report='2377010003'
) AS T
ON T.EventCode LIKE CAST(TrackingCodes.TrackingCode AS nvarchar(20))+'%'
ORDER BY TrackingCode ASC
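
To see why chaining three INNER JOINs can return nothing while a union works, here is a small sketch. Everything in it is an assumption made for illustration: the schema is cut down to two columns per table, the sample rows are invented, and SQLite stands in for SQL Server (so string concatenation uses `||` instead of `+` and no CAST is needed). Chained inner joins keep a tracking code only if it has a match in all three tables, whereas UNION ALL keeps it if it has a match in any of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TrackingCodes (TrackingCode TEXT, Opponent TEXT);
    CREATE TABLE Transactions     (EventCode TEXT, Report TEXT);
    CREATE TABLE Transactions_Yr1 (EventCode TEXT, Report TEXT);
    CREATE TABLE Transactions_Yr2 (EventCode TEXT, Report TEXT);
    INSERT INTO TrackingCodes VALUES ('100', 'Lions'), ('200', 'Bears'), ('300', 'Owls');
    INSERT INTO Transactions     VALUES ('100-A', '2377010003');
    INSERT INTO Transactions_Yr1 VALUES ('200-B', '2377010003'), ('100-C', '2377010003');
    INSERT INTO Transactions_Yr2 VALUES ('300-D', '9999999999');
""")

# Union the yearly tables first, then join the combined set once.
UNION_QUERY = """
    SELECT DISTINCT tc.TrackingCode, tc.Opponent
    FROM TrackingCodes tc
    INNER JOIN (
        SELECT EventCode FROM Transactions     WHERE Report = '2377010003'
        UNION ALL
        SELECT EventCode FROM Transactions_Yr1 WHERE Report = '2377010003'
        UNION ALL
        SELECT EventCode FROM Transactions_Yr2 WHERE Report = '2377010003'
    ) AS t ON t.EventCode LIKE tc.TrackingCode || '%'
    ORDER BY tc.TrackingCode ASC
"""

# Chained inner joins: a code survives only if all three tables match it.
CHAINED_QUERY = """
    SELECT DISTINCT tc.TrackingCode
    FROM TrackingCodes tc
    INNER JOIN Transactions t
        ON t.Report = '2377010003' AND t.EventCode LIKE tc.TrackingCode || '%'
    INNER JOIN Transactions_Yr1 t1
        ON t1.Report = '2377010003' AND t1.EventCode LIKE tc.TrackingCode || '%'
    INNER JOIN Transactions_Yr2 t2
        ON t2.Report = '2377010003' AND t2.EventCode LIKE tc.TrackingCode || '%'
"""

union_rows = conn.execute(UNION_QUERY).fetchall()
chained_rows = conn.execute(CHAINED_QUERY).fetchall()
print(union_rows)    # [('100', 'Lions'), ('200', 'Bears')]
print(chained_rows)  # []
```

Because the outer query already applies DISTINCT, UNION ALL is enough here; plain UNION would just deduplicate the transaction rows a second time.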

Subqueries with different universes

I have an Oracle DB and I need to run a select with sub selects, however, none of them share the same table universe, therefore, I would need to do something like this:
SELECT (
SELECT COUNT(*)
FROM user_table
) AS tot_user,
(
SELECT COUNT(*)
FROM cat_table
) AS tot_cat,
(
SELECT COUNT(*)
FROM course_table
) AS tot_course
I know this is possible in other databases, but I need something like this for Oracle.
Can someone help?

To make this work in Oracle, add FROM dual to the end:
SELECT (SELECT COUNT(*)
FROM user_table
) AS tot_user,
(SELECT COUNT(*)
FROM cat_table
) AS tot_cat,
(SELECT COUNT(*)
FROM course_table
) AS tot_course
FROM dual;
A database independent way of writing the query is:
select tot_user, tot_cat, tot_course
from (SELECT COUNT(*) as tot_user
FROM user_table
) u cross join
(SELECT COUNT(*) as tot_cat
FROM cat_table
) c cross join
(SELECT COUNT(*) as tot_course
FROM course_table
) ct;
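
The cross-join form can be tried out quickly with a throwaway database. This is only a sketch: the single `id` column and the row counts are invented, and SQLite is used because it needs no setup. Each derived table yields exactly one row, so the cross join of the three also yields exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_table (id INTEGER);
    CREATE TABLE cat_table (id INTEGER);
    CREATE TABLE course_table (id INTEGER);
    INSERT INTO user_table VALUES (1), (2), (3);
    INSERT INTO cat_table VALUES (1), (2);
    INSERT INTO course_table VALUES (1);
""")

# Three one-row derived tables cross-joined into a single result row.
COUNT_QUERY = """
    SELECT tot_user, tot_cat, tot_course
    FROM (SELECT COUNT(*) AS tot_user FROM user_table) u
    CROSS JOIN (SELECT COUNT(*) AS tot_cat FROM cat_table) c
    CROSS JOIN (SELECT COUNT(*) AS tot_course FROM course_table) ct
"""

count_rows = conn.execute(COUNT_QUERY).fetchall()
print(count_rows)  # [(3, 2, 1)]
```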

Self-join of a subquery

I was wondering, is it possible to join the result of a query with itself, using PostgreSQL?

You can do so with WITH:
WITH subquery AS(
SELECT * FROM TheTable
)
SELECT *
FROM subquery q1
JOIN subquery q2 on ...
Or by creating a VIEW that contains the query, and joining on that:
SELECT *
FROM TheView v1
JOIN TheView v2 on ...
Or the brute force approach: type the subquery twice:
SELECT *
FROM (
SELECT * FROM TheTable
) sub1
LEFT JOIN (
SELECT * FROM TheTable
) sub2 ON ...
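
As a concrete instance of the WITH variant, here is a minimal sketch. The columns `id` and `val`, the filter, and the join condition are all made up for illustration, and SQLite is used instead of PostgreSQL only so that the example runs without a server. The CTE syntax shown is the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TheTable (id INTEGER, val INTEGER);
    INSERT INTO TheTable VALUES (1, 10), (2, 20), (3, 30);
""")

# The CTE is written once and referenced twice under different aliases.
SELF_JOIN_QUERY = """
    WITH subquery AS (
        SELECT id, val FROM TheTable WHERE val >= 20
    )
    SELECT q1.id, q2.id
    FROM subquery q1
    JOIN subquery q2 ON q1.val < q2.val
    ORDER BY q1.id, q2.id
"""

pairs = conn.execute(SELF_JOIN_QUERY).fetchall()
print(pairs)  # [(2, 3)]
```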

Do you mean joining the result of a query on a table back to that same table? If so, then yes, it's possible, e.g.:
--Bit of a contrived example but...
SELECT *
FROM Table
INNER JOIN
(
SELECT
UserID, Max(Login) as LastLogin
FROM
Table
WHERE
UserGroup = 'SomeGroup'
GROUP BY
UserID
) foo
ON Table.UserID = Foo.UserID AND Table.Login = Foo.LastLogin

Yes, just alias the queries:
SELECT *
FROM (
SELECT *
FROM table
) t1
JOIN (
SELECT *
FROM table
) t2
ON t1.column < t2.other_column
