Alternative to Left Anti Join in Hive

I am able to use LEFT ANTI JOIN in Impala, but I cannot use the same in Hive. I am using a left anti join because I want to exclude rows from the main table by parent code and child code, using an external table that holds only parent code information. In Hive I tried NOT IN subqueries, but that doesn't work either.
Main Table
dsc p_code c_code
asd 100 0
qwe 200 200
wrt 300 100
zrt 400 600
xqt 500 500
External table
Say it has just p_code = 100.
Using the p_code values from this external table, I want to exclude every row of the main table whose p_code or c_code matches.
This is the query I tried
with test_codes as
(select 'asd' as dsc, 100 as p_code, 000 c_code
union all
select 'qwe' as dsc, 200 as p_code, 200 c_code
union all
select 'wrt' as dsc, 300 as p_code, 100 c_code
union all
select 'zrt' as dsc, 400 as p_code, 600 c_code
union all
select 'xqt' as dsc, 500 as p_code, 500 c_code
),
ext_codes as
(select 100 as p_code )
select * from test_codes
left anti join ext_codes
on test_codes.p_code = ext_codes.p_code
or test_codes.c_code = ext_codes.p_code
;
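A possible workaround, offered as a sketch rather than a verified fix: assuming the Hive version in use rejects LEFT ANTI JOIN (and possibly the OR in the join condition), the anti join can be emulated with one LEFT OUTER JOIN per column and IS NULL filters. The aliases t, e1 and e2 are made up for illustration; the data is the sample from the question.

```sql
-- Sketch: emulate the anti join with two LEFT OUTER JOINs.
-- A row survives only if neither its p_code nor its c_code
-- finds a match in ext_codes.
with test_codes as
(select 'asd' as dsc, 100 as p_code, 0 as c_code
union all
select 'qwe' as dsc, 200 as p_code, 200 as c_code
union all
select 'wrt' as dsc, 300 as p_code, 100 as c_code
union all
select 'zrt' as dsc, 400 as p_code, 600 as c_code
union all
select 'xqt' as dsc, 500 as p_code, 500 as c_code
),
ext_codes as
(select 100 as p_code)
select t.dsc, t.p_code, t.c_code
from test_codes t
left outer join ext_codes e1 on t.p_code = e1.p_code  -- match on parent code
left outer join ext_codes e2 on t.c_code = e2.p_code  -- match on child code
where e1.p_code is null
  and e2.p_code is null;
```

With the sample data this should keep only qwe, zrt and xqt: asd is dropped because its p_code is 100, and wrt because its c_code is 100.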

Related

Oracle SQL - join two tables + show unmatched results

I have two tables:
POSITION_TABLE
Account  Security  Pos_Quantity
1        A         100
2        B         200
TRADE_TABLE
Account  Security  Trade_Quantity
1        A         50
2        C         10
I want to join them so that matching rows are displayed as one row, but unmatched rows from either table are displayed as well, so a standard LEFT JOIN wouldn't work.
Expected output:
Account  Security  Pos_Quantity  Trade_Quantity
1        A         100           50
2        B         200           0
2        C         0             10
How do I do that?
A full outer join would work nicely here:
with position_table as (select 1 account, 'A' security, 100 pos_quantity from dual union all
select 2 account, 'B' security, 200 pos_quantity from dual),
trade_table as (select 1 account, 'A' security, 50 trade_quantity from dual union all
select 2 account, 'C' security, 10 trade_quantity from dual)
select coalesce(pt.account, tt.account) account,
coalesce(pt.security, tt.security) security,
coalesce(pt.pos_quantity, 0) pos_quantity,
coalesce(tt.trade_quantity, 0) trade_quantity
from position_table pt
full outer join trade_table tt on pt.account = tt.account
and pt.security = tt.security
order by account,
security;
db<>fiddle - note how you can see that the full outer join works just fine with subqueries defined in a WITH clause!

SELECT in SQL Server to show all distinct rows from the tables OnHand, Sale and Purchase that have either/or Qty field not empty

I need to write a SELECT query in SQL Server which uses a JOIN or UNION that selects distinct ItmNo or Code rows from 3 tables OnHand, Sale and Purchase.
Here are the details of the tables I have and what I need. ItmNo and/or Code columns can be used as foreign keys to join the tables.
These are my input tables-
Table OnHand
ID ItmNo Code Qty
----------------------------------
1 I001 001 100
2 I001 001 50
3 I003 003 300
Table Sale
ID ItmNo Code Qty
----------------------------------
1 I001 001 100
2 I004 004
3 I003 003 120
Table Purchase
ID ItmNo Code Qty
----------------------------------
1 I005 005 10
2 I003 003 200
3 I003 003 300
And this is what I need as output. Only DISTINCT ItmNo and Code should be displayed here:
ID ItmNo Code SumQtyOnHand SumQtyOnSale SumQtyOnPurchase
------------------------------------------------------------------------------
1 I001 001 150 100
2 I003 003 300 120 500
3 I005 005 10
Below is the SELECT query that I have tried, but I cannot get the output I want:
SELECT
A.ItmNo, A.Code,
A2.TOTAL SumQtyOnHand,
B.TOTAL SumQtyOnSale,
C.TOTAL SumQtyOnPurchase
FROM
dbo.OnHand A
LEFT JOIN
(SELECT ItmNo, Code, SUM(Qty) TOTAL
FROM dbo.OnHand
GROUP BY ItmNo, Code) A2 ON A.ItmNo = A2.ItmNo
LEFT JOIN
(SELECT ItmNo, Code, SUM(Qty) TOTAL
FROM dbo.Sale
GROUP BY ItmNo, Code) B ON A.ItmNo = A2.ItmNo
LEFT JOIN
(SELECT ItmNo, Code, SUM(Qty) TOTAL
FROM dbo.Purchase
GROUP BY ItmNo, Code) C ON A.ItmNo = A2.ItmNo
Please suggest the correction in the SELECT query to achieve the above output.
Thanks in advance!
I think you are on the right track with the pre-aggregation subqueries. Then, you can full join. The syntax is a bit cumbersome in SQL Server, which does not support the using() clause:
select
coalesce(o.itmno, s.itmno, p.itmno) as itmno,
coalesce(o.code, s.code, p.code) as code,
o.SumQtyOnHand,
s.SumQtyOnSale,
p.SumQtyOnPurchase
from (
select itmno, code, sum(qty) SumQtyOnHand
from dbo.onhand
group by itmno, code
) o
full join (
select itmno, code, sum(qty) SumQtyOnSale
from dbo.sale
group by itmno, code
) s on s.itmno = o.itmno and s.code = o.code
full join (
select itmno, code, sum(qty) SumQtyOnPurchase
from dbo.purchase
group by itmno, code
) p on p.itmno = coalesce(s.itmno, o.itmno) and p.code = coalesce(s.code, o.code)
It might be simpler expressed with union all and aggregation:
select itmno, code,
sum(qtyOnHand) as SumQtyOnHand,
sum(qtyOnSale) as SumQtyOnSale,
sum(qtyOnPurchase) as SumQtyOnPurchase
from (
select itmno, code, qty as qtyOnHand, null as qtyOnSale, null as qtyOnPurchase from dbo.onhand
union all select itmno, code, null, qty, null from dbo.sale
union all select itmno, code, null, null, qty from dbo.purchase
) t
group by itmno, code

Group Information in joined table for last available year

I have this table with a unique identifier and its group; I want to get totals from another table based on its Group
Table 1
UNIQUE_ID ! Group
1 West
2 West
3 West
4 West
5 West
6 East
7 East
Then I have this second table, which I join with the first
Table 2
UNIQUE_ID ! NET PROFIT ! ASSETS ! EQUITY ! YEAR
1 100 100 100 2016
1 100 100 100 2015
2 100 100 100 2016
2 100 100 100 2015
3 100 100 100 2016
3 100 100 100 2015
***4 10 10 10 2015***
5 100 100 100 2016
5 100 100 100 2015
***6 10 10 10 2014***
7 100 100 100 2016
7 100 100 100 2015
7 100 100 100 2014
I join the two tables and group by Group, which gives me the totals for net profit, assets and equity. The problem is that this sums all the years available in Table 2. If I instead filter on year = 2016, I only get the 2016 totals and lose the rows whose latest year is 2015 or 2014.
I need to group by Group and sum only the last available year for each unique ID, so that the query would give me this table:
Group ! NET PROFIT ! ASSETS ! EQUITY
East 110 110 110
West 410 410 410
Can anyone help me? I've looked everywhere and tried a number of combinations but without success
Consider joining an aggregate derived table to the join of the other two tables; the last INNER JOIN essentially acts as a WHERE clause that keeps only the latest year for each unique_id.
SELECT t1.`GROUP`, SUM(t2.NET_PROFIT) AS SUM_NET_PROFIT,
SUM(t2.ASSETS) AS SUM_ASSETS,
SUM(t2.EQUITY) AS SUM_EQUITY
FROM (`table2` t2
INNER JOIN `table1` t1
ON t1.UNIQUE_ID = t2.UNIQUE_ID)
INNER JOIN
(SELECT t2.UNIQUE_ID, MAX(t2.`YEAR`) AS MAX_YEAR
FROM `table2` t2
GROUP BY t2.UNIQUE_ID) g
ON t2.`UNIQUE_ID` = g.`UNIQUE_ID` AND t2.`YEAR` = g.`MAX_YEAR`
GROUP BY t1.`GROUP`;
Do note the parentheses used to wrap first join pairing of tables, required in MS Access.
Is this what you want?
SELECT Group_,
SUM(NET_PROFIT) AS NET_PROFIT_YR,
SUM(ASSETS) AS ASSETS_YR,
SUM(EQUITY) As EQUITY_YR
FROM Table1 AS T1
INNER
JOIN (SELECT T2_RAW.*
FROM ( SELECT Unique_ID,
MAX(year) AS m_year
FROM Table2 AS T2
GROUP
BY Unique_ID
) AS MYR
INNER
JOIN Table2 AS T2_RAW
ON MYR.unique_id = T2_RAW.unique_id
AND MYR.m_year = T2_RAW.year
) AS TMP
ON T1.unique_id = TMP.unique_id
GROUP
BY group_;
I. The complete SQL statement:
SELECT
ta.group,
SUM(IFNULL(tb.net_profit, 0)) as sumNetProfit,
SUM(IFNULL(tb.assets, 0)) as sumAssets,
SUM(IFNULL(tb.equity, 0)) as sumEquity
FROM table_a AS ta
LEFT JOIN (
SELECT
unique_id,
max(year) as maxYear
FROM table_b
GROUP BY unique_id
) AS tbMaxYears ON tbMaxYears.unique_id = ta.unique_id
LEFT JOIN table_b AS tb ON
tb.unique_id = ta.unique_id
AND tb.year = tbMaxYears.maxYear
GROUP BY ta.group;
II. Description:
The inner query:
SELECT
unique_id,
max(year) as maxYear
FROM table_b
GROUP BY unique_id
Selects the unique_ids from table_b and, for each unique_id, the corresponding latest year, i.e. the maximum year;
Groups the fetched records by the unique_id;
Is used in a LEFT JOIN statement under the alias tbMaxYears, and its results are filtered by the unique_ids fetched from table_a.
The inner query results look like the following:
unique_id maxYear
-------------------
1 2016
2 2016
3 2016
4 2015
5 2016
6 2014
7 2016
The outer query:
SELECT
ta.group,
SUM(IFNULL(tb.net_profit, 0)) as sumNetProfit,
SUM(IFNULL(tb.assets, 0)) as sumAssets,
SUM(IFNULL(tb.equity, 0)) as sumEquity
FROM table_a AS ta
LEFT JOIN (
<THE-INNER-QUERY-RESULTS>
) AS tbMaxYears ON tbMaxYears.unique_id = ta.unique_id
LEFT JOIN table_b AS tb ON
tb.unique_id = ta.unique_id
AND tb.year = tbMaxYears.maxYear
GROUP BY ta.group;
Reads all table_a records;
Attaches the tbMaxYears details (fetched through the inner query);
Attaches the table_b details;
Groups the records by the group column;
Calculates the corresponding sums (treating NULL values as 0).
The results of the outer query (i.e. the final results):
group sumNetProfit sumAssets sumEquity
---------------------------------------------
East 110 110 110
West 410 410 410
The second LEFT JOIN in the outer query:
LEFT JOIN table_b AS tb ON
tb.unique_id = ta.unique_id
AND tb.year = tbMaxYears.maxYear
Attaches (joins) the table_b details to the records fetched from table_a;
Only those table_b records are attached that have the same unique_id as the corresponding table_a record AND (!) the same year as the corresponding maxYear value from tbMaxYears.
III. Used table structure:
I used a MySQL database with the following CREATE TABLE syntax:
CREATE TABLE `table_a` (
`unique_id` int(11) DEFAULT NULL,
`group` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `table_b` (
`unique_id` int(11) DEFAULT NULL,
`net_profit` int(11) DEFAULT NULL,
`assets` int(11) DEFAULT NULL,
`equity` int(11) DEFAULT NULL,
`year` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I used the same data as you.
Good luck!
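For anyone who wants to reproduce this, here is a sketch of the INSERT statements for the sample rows. The values are copied from Table 1 and Table 2 in the question; the statements themselves are my addition and assume the MySQL table definitions from section III.

```sql
-- Sample rows from the question's Table 1 (unique id and its group).
INSERT INTO `table_a` (`unique_id`, `group`) VALUES
  (1, 'West'), (2, 'West'), (3, 'West'), (4, 'West'), (5, 'West'),
  (6, 'East'), (7, 'East');

-- Sample rows from the question's Table 2 (one row per id and year).
INSERT INTO `table_b` (`unique_id`, `net_profit`, `assets`, `equity`, `year`) VALUES
  (1, 100, 100, 100, 2016), (1, 100, 100, 100, 2015),
  (2, 100, 100, 100, 2016), (2, 100, 100, 100, 2015),
  (3, 100, 100, 100, 2016), (3, 100, 100, 100, 2015),
  (4,  10,  10,  10, 2015),
  (5, 100, 100, 100, 2016), (5, 100, 100, 100, 2015),
  (6,  10,  10,  10, 2014),
  (7, 100, 100, 100, 2016), (7, 100, 100, 100, 2015), (7, 100, 100, 100, 2014);
```

With these rows loaded, the statement from section I should give 410 for West (four ids at 100 plus id 4 at 10) and 110 for East (id 6 at 10 plus id 7 at 100).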

SQL query to obtain the correct retarded debt in a report

There's a payment table with these fields:
Dossier_id
Year
Amount
Payed
Retarded
(all fields are numeric)
Imagine the table with these entries:
Dossier_id || Year || Amount || Payed || Retarded
==================================================================
1000 2010 500 100 400
2000 2007 700 500 200
1000 2011 1200 700 500
2000 2009 900 800 100
==================================================================
Total || 3300 2100 600
==================================================================
How can I write a query that calculates only the last row of that table (Total), so that the Retarded total reflects the actual outstanding liability, i.e. the Retarded value of the latest year of each dossier? (I played with inner joins but couldn't figure it out.)
This should do it:
SELECT SUM(a.amount), SUM(a.payed), MAX(b.retarded)
FROM table a INNER JOIN (SELECT SUM(retarded) AS retarded
FROM table b INNER JOIN (SELECT dossier_id, MAX(year) AS year FROM table GROUP BY dossier_id) c ON b.dossier_id = c.dossier_id AND b.year = c.year) b ON 1 = 1
EDIT (stupid access - join not supported issue):
SELECT SUM(a.amount), SUM(a.payed), MAX(b.retarded)
FROM table a INNER JOIN (SELECT SUM(retarded) AS retarded
FROM table b INNER JOIN (SELECT dossier_id, MAX(year) AS year FROM table GROUP BY dossier_id) c ON (b.dossier_id = c.dossier_id AND b.year = c.year)) b ON (1 = 1)
It's this simple query:
SELECT SUM(AMOUNT), SUM(PAYED), SUM(RETARDED) FROM TABLE;
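One thing to check against the sample data: summing Retarded over all four rows gives 400 + 200 + 500 + 100 = 1200, not the 600 shown in the Total row. The 600 only comes out if Retarded is taken from the latest year of each dossier (500 for dossier 1000 in 2011, 100 for dossier 2000 in 2009). A sketch of that idea in plain SQL follows; the table name payment and the aliases are assumptions of mine, and I have not tested it on MS Access.

```sql
-- Sketch: total Amount and Payed over all rows, but Retarded only
-- from the latest year of each dossier. "payment" is a hypothetical
-- table name standing in for the table described in the question.
SELECT t.total_amount, t.total_payed, r.total_retarded
FROM (
    SELECT SUM(amount) AS total_amount, SUM(payed) AS total_payed
    FROM payment
) t
CROSS JOIN (
    SELECT SUM(p.retarded) AS total_retarded
    FROM payment p
    INNER JOIN (
        -- latest year per dossier
        SELECT dossier_id, MAX(year) AS max_year
        FROM payment
        GROUP BY dossier_id
    ) m ON p.dossier_id = m.dossier_id AND p.year = m.max_year
) r;
```

Both derived tables return exactly one row, so the CROSS JOIN yields a single result row; for the four sample rows that is 3300, 2100, 600.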

Is it possible to write single query on two table which are not connected to each other?

I have two tables that are not connected to each other, and I am wondering whether it is possible to write a single query over both of them.
A sample code snippet would be a great help for my understanding.
Table: Payment
Payment_id Payment_status amount
1 1001 201 400
2 1002 403 450
3 1003 204 460
After running the query SELECT Payment_status FROM Payment GROUP BY Payment_status
it gives me a result like:
Payment_status
1 201
2 403
3 204
I have one more table named status_code as
code description
1 201 In progress
2 403 Complete
3 204 On Hold
In the above query I want Payment_status and its respective description; the result should look like this
Payment_status description
1 201 In progress
2 403 Complete
3 204 On Hold
A Cartesian join (note there is no JOIN condition). All possible combinations of records are in the results:
CREATE TABLE tableA (charfield CHAR(2));
CREATE TABLE tableB (numberfield NUMBER(1));
INSERT INTO tableA VALUES ('A');
INSERT INTO tableA VALUES ('B');
INSERT INTO tableB VALUES (1);
INSERT INTO tableB VALUES (2);
SELECT *
FROM tablea CROSS JOIN tableb
Results:
charfield|numberfield
=====================
A |1
A |2
B |1
B |2
SELECT p.payment_id, p.Payment_status, s.description
FROM Payment p
JOIN status_code s
ON p.Payment_status = s.code
This uses a SQL 'join' to connect the two tables on the status_code table's code property.
This will give you results like
Payment_id Payment_status description
1001 201 In progress
1002 403 Complete
1003 204 On Hold
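If you want one row per distinct status, as in the GROUP BY query from the question, a variant of the same join could look like the sketch below. It assumes each code appears once in status_code; the aliases p and s are the ones used above.

```sql
-- Sketch: one row per distinct Payment_status with its description.
SELECT p.Payment_status, s.description
FROM Payment p
JOIN status_code s
  ON p.Payment_status = s.code
GROUP BY p.Payment_status, s.description;
```

Grouping by both columns keeps the query valid in databases that require every selected column to be grouped or aggregated.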
You can use a UNION query, but the field types/column counts in both sub-queries must match:
SELECT a, b, c
FROM table1
UNION
SELECT p, q, r
FROM table2
The alternative is simply doing a full cartesian join, which can return HUGE result sets if the two tables have a large number of rows - you'll be getting n x m rows