Left Outer Join with subqueries IN/EXIST at Hive - sql

I am trying to run the query below.
The query involves 7 tables, and I want all of them left joined on A.conn_keyy, plus the other conditions in each ON clause.
My confusion comes when I try to join CPLCUR based on A; this condition does not work:
(CPLCUR.conn_keyy in ( a.conn_keyy = b.conn_keyy )
It produces the error:
both left and right aliases encountered in join 'conn_key'
set hive.support.quoted.identifiers=none;
select
coalesce(a.conn_keyy, b.conn_keyy,CPLCUR.conn_keyy) as rrconn_keyy,
b.rfbbn, b.LINES_ID,b.TYPE,CPLCUR.*
FROM
(tablee.aa)A
LEFT OUTER JOIN
(tablee.bb) B
ON (A.conn_keyy = B.conn_keyy)
LEFT OUTER JOIN (SELECT `(c21)?+.+` FROM tablee.cc ) CPLCUR
ON (CPLCUR.conn_keyy in ( a.conn_keyy = b.conn_keyy )
AND CPLCUR.cllt = REGEXP_EXTRACT(B.rfbbn,'^(?:[^*]*\\*){2}([^*]*)',1))
LEFT OUTER JOIN (SELECT DISTINCT * FROM tablee.dd) CPLBAL
ON CPLBAL.conn_keyy = A.conn_keyy
AND CPLBAL.SEQUENCE = CPLCUR.SEQUENCE
AND CPLBAL.dtdt = '1999'
LEFT OUTER JOIN
(tablee.REP)REP
ON REP.relino = B.lnido
LEFT OUTER JOIN tablee.P PRD
ON PRD.PRODUCT_CODE = REGEXP_EXTRACT(A.conn_keyy,'[.]([^.]+)',1)
AND PRD.dtdt = '1999'
WHERE B.lnido LIKE 'PLCONS1%'
) rrvv;
What is the best practice to achieve this?
The desired results:
+-----------+---------+--------+----------+-------------+-------+-----+-----+
| conn_keyy | b.rfbbn | b.LINES| b.TYPE | CPLCUR | CPLBAL| REP | PRD |
+-----------+---------+--------+----------+-------------+-------+-----+-----+
| 111 | aaa | PCOS1% | bbsr | 2019-02-21 | | | |
| 200 | | PCOS1% | ny | X | | | |
| 222 | bbb | PCOS1% | pp | Y | | | |
| 300 | rrr | PCOS1% | atl | 2019-03-18 | | | |
| 333 | ccc | PCOS1% | dd | Z | | | |
| 400 | vvv | PCOS1% | tt | 2019-03-18 | | | |
+-----------+---------+--------+----------+-------------+-------+-----+-----+
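One possible reading, offered only as a sketch: Hive tends to raise this error when a join condition is not a plain equality between the two sides, and `x IN (a = b)` is not one. If the intent is simply "join CPLCUR on the same key as A", an ordinary equality on A's key may be enough. The example below runs in SQLite (via Python) rather than Hive, uses tiny made-up tables named after the aliases in the question, and leaves out the REGEXP_EXTRACT condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of tablee.aa / bb / cc
    CREATE TABLE aa (conn_keyy TEXT);
    CREATE TABLE bb (conn_keyy TEXT, rfbbn TEXT);
    CREATE TABLE cc (conn_keyy TEXT, cllt TEXT);
    INSERT INTO aa VALUES ('111'), ('200');
    INSERT INTO bb VALUES ('111', 'x*y*aaa');
    INSERT INTO cc VALUES ('111', 'aaa'), ('200', 'zzz');
""")

# Every join hangs off A's key with a plain equality, so rows of A
# survive even when B or CPLCUR has no match.
rows = conn.execute("""
    SELECT a.conn_keyy, b.rfbbn, cplcur.cllt
    FROM aa a
    LEFT OUTER JOIN bb b      ON a.conn_keyy = b.conn_keyy
    LEFT OUTER JOIN cc cplcur ON cplcur.conn_keyy = a.conn_keyy
    ORDER BY a.conn_keyy
""").fetchall()

for row in rows:
    print(row)
```

Key 200 has no row in bb, so its rfbbn comes back as NULL while the CPLCUR column is still filled in.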

Related

select count of sold products with 2 attributes on different rows

I am trying to generate a report of every product sold of SKUABC in size 34 with inseam 33 (it is available in 33 and 31 inseam).
Table - Orders:
+-----------+------------------------+--+
| Orders_id | date_purchased | |
+-----------+------------------------+--+
| 46198 | 2020-10-18 19:43:25 | |
| 46199 | 2020-10-19 19:43:25 | |
| 46200 | 2020-10-22 19:43:25 | |
+-----------+------------------------+--+
Table - orders_products:
+--------------------+-----------+-------------+----------------+-----+
| orders_products_id | Orders_id | products_id | products_model | QTY |
+--------------------+-----------+-------------+----------------+-----+
| 42154907 | 46198 | 878 | SKUABC | 1 |
| 42154908 | 46198 | 878 | SKUABC | 1 |
| 42154909 | 46198 | 282 | DIFFSKU | 1 |
+--------------------+-----------+-------------+----------------+-----+
Table - Orders_products_attributes (showing order_id 46198 only):
+------------------------------+-----------+--------------------+-----------------+-----------------------+--+
| orders_products_attribute_id | orders_id | orders_products_id | Product options | Product_options_value | |
+------------------------------+-----------+--------------------+-----------------+-----------------------+--+
| 167618 | 46198 | 42155189 | Color | Green | |
| 167619 | 46198 | 42155189 | Inseam | 33 | |
| 167620 | 46198 | 42155189 | Size | 34 | |
+------------------------------+-----------+--------------------+-----------------+-----------------------+--+
My SQL so far:
SELECT distinct o.orders_id, op.products_model, opa.products_options_values, sum(op.products_quantity)
FROM orders o
LEFT JOIN orders_products op
ON o.orders_id = op.orders_id
LEFT JOIN orders_products_attributes opa
on op.orders_id = opa.orders_id
WHERE op.products_model in ('SKUABC')
and opa.`products_options_values` in ('36')
and o.date_purchased > '2020-10-13'
If I add:
and opa.`products_options_values` in ('31')
it returns no results, because the inseam and size values are stored in separate rows. The other problem with the code above is that it combines all orders/ordered products where the inseam is either 33 or 31, but I want them kept separate.
My desired output would be:
+--------+------------+------------+-------------------+
| model | attribute1 | attribute2 | quantity sold sum |
+--------+------------+------------+-------------------+
| ABCSKU | 34 | 33 | 120 |
+--------+------------+------------+-------------------+
Here is a fun solution: join the attributes table twice, once per attribute, and give each products_options_values column its own alias. Then everything becomes easy:
SELECT distinct o.orders_id, op.products_model, opa.products_options_values
AS Inseam,opa2.products_options_values AS Size, sum(op.products_quantity)
FROM orders o
LEFT JOIN orders_products op
ON o.orders_id = op.orders_id
LEFT JOIN orders_products_attributes opa
ON op.orders_id = opa.orders_id
LEFT JOIN orders_products_attributes opa2
ON op.orders_id = opa2.orders_id
--your condition below
Then just use opa for the inseam and opa2 for the size. It is crude, but it works. You can even null out the row data by adding a condition on the Product options column, which makes a later insert easier.
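To make the double-join idea concrete, here is a small sketch run against SQLite from Python. The data is made up, the attribute name column is assumed to be called products_options, and the attributes are joined on orders_products_id rather than orders_id (an assumption, so that each ordered line only picks up its own size and inseam).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_products (
        orders_products_id INTEGER, orders_id INTEGER,
        products_model TEXT, products_quantity INTEGER);
    CREATE TABLE orders_products_attributes (
        orders_products_id INTEGER,
        products_options TEXT, products_options_values TEXT);
    -- Made-up sample rows
    INSERT INTO orders_products VALUES
        (1, 46198, 'SKUABC', 1),
        (2, 46198, 'SKUABC', 2),
        (3, 46199, 'SKUABC', 4),
        (4, 46199, 'DIFFSKU', 1);
    INSERT INTO orders_products_attributes VALUES
        (1, 'Size', '34'), (1, 'Inseam', '33'),
        (2, 'Size', '34'), (2, 'Inseam', '31'),
        (3, 'Size', '34'), (3, 'Inseam', '33'),
        (4, 'Size', '34'), (4, 'Inseam', '33');
""")

# One alias per attribute: sz carries the size row, ins the inseam row.
rows = conn.execute("""
    SELECT op.products_model,
           sz.products_options_values  AS size,
           ins.products_options_values AS inseam,
           SUM(op.products_quantity)   AS quantity_sold
    FROM orders_products op
    JOIN orders_products_attributes sz
      ON sz.orders_products_id = op.orders_products_id
     AND sz.products_options = 'Size'
    JOIN orders_products_attributes ins
      ON ins.orders_products_id = op.orders_products_id
     AND ins.products_options = 'Inseam'
    WHERE op.products_model = 'SKUABC'
      AND sz.products_options_values = '34'
      AND ins.products_options_values = '33'
    GROUP BY op.products_model,
             sz.products_options_values,
             ins.products_options_values
""").fetchall()

print(rows)
```

Because size and inseam now sit side by side on one result row, both filters can be applied at once, and the line with inseam 31 drops out of the sum.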

SQL Question Looking Up Value in Same Table

Trying to use a self join in SQL to look up a value in the table and apply it.
Here's what I got:
Actual output:
+-------+-----+--------+-----------+
| TRKID | Fac | NewFac | BAG_TRKID |
+-------+-----+--------+-----------+
| 449 | 11 | 11 | 999 |
| 473 | 11 | 11 | 737 |
| 477 | 11 | 11 | 737 |
| 482 | 11 | 11 | 737 |
| 737 | 89 | 89 | |
+-------+-----+--------+-----------+
Desired output:
+-------+-----+--------+-----------+
| TRKID | Fac | NewFac | BAG_TRKID |
+-------+-----+--------+-----------+
| 449 | 11 | 11 | 999 |
| 473 | 11 | 89 | 737 |
| 477 | 11 | 89 | 737 |
| 482 | 11 | 89 | 737 |
| 737 | 89 | 89 | |
+-------+-----+--------+-----------+
Here's the code below. I can't seem to get the table that I want. The Bag TrkID's Facility Num is not becoming the TrkID New Facility Num.
Select
TABLEA.TRKID,
TABLEA.FAC,
NVL(TABLEA.FAC, TABLEB.FAC) as NEWFAC,
TABLEA.BAG_TRKID
FROM
(
Select
HSD.TRKID,
HSD.NLPT as FAC,
SBPD.BAG_TRKID
From
HSD
LEFT JOIN
SBPD
ON
SBPD.BAG_TRKID = HSD.TRKID
Where
HSD.SCANDT BETWEEN 'Yesterday' and 'Today'
) TABLEA
LEFT JOIN
(
Select
HSD.TRKID,
HSD.NLPT as FAC,
SBPD.BAG_TRKID
From
HSD
LEFT JOIN
SBPD
ON
SBPD.BAG_TRKID = HSD.TRKID
Where
HSD.SCANDT BETWEEN 'Yesterday' and 'Today'
) TABLEB
ON
TABLEA.TRKID = TABLEB.BAG_TRKID
Perhaps something like
select a.TrkID, a."Facility Number", a.BAG_TRKID, b.TrkID as "NEW Fac"
from tbl a
left join tbl b on (a.TrkID = b.trk_id_reference)
Given the limited information that you've shared, I was able to achieve the expected output with the following query:
SELECT a.TrkID, a.facility_number, a.bag_trkid, b.facility_number as new_facility_number
FROM test_tbl AS a
LEFT JOIN test_tbl AS b ON a.bag_trkid = b.trkid OR (a.bag_trkid IS NULL AND b.trkid = a.trkid);
You want the new_facility_number for a row to come from its bag_trkid, which this join achieves: LEFT JOIN test_tbl AS b ON a.bag_trkid = b.trkid.
The trick is to account for rows where the left table (which I refer to as a) has no bag_trkid. In that case, we keep new_facility_number the same as a.facility_number by joining the table to itself on trkid alone: OR (a.bag_trkid IS NULL AND b.trkid = a.trkid)
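A slightly different way to write the same lookup, sketched here in SQLite via Python, is to keep the join condition simple and fall back to the row's own facility with COALESCE. The table name scans and its column names are made up for the example; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table holding the rows from the question
    CREATE TABLE scans (trkid INTEGER, fac INTEGER, bag_trkid INTEGER);
    INSERT INTO scans VALUES
        (449, 11, 999), (473, 11, 737), (477, 11, 737),
        (482, 11, 737), (737, 89, NULL);
""")

# b is the bag's own row; when no such row exists, keep a's facility.
rows = conn.execute("""
    SELECT a.trkid, a.fac, COALESCE(b.fac, a.fac) AS newfac, a.bag_trkid
    FROM scans a
    LEFT JOIN scans b ON a.bag_trkid = b.trkid
    ORDER BY a.trkid
""").fetchall()

for row in rows:
    print(row)
```

With this variant, a row whose bag is not in the table at all (449 with bag 999) also keeps its own facility, which matches the desired output in the question.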

SQL Left join where data missing in join column

I'm not even sure this is possible. I'm trying to join two tables together, but I'm not getting my expected results. My query is as follows:
SELECT inc.NUMBER as TICKET,
inc.UNIV_NUM,
inc.ASSIGNEE,
work.SUBMIT_DATE,
work.TYPE
FROM dbo.HELP_DESK as inc
LEFT JOIN dbo.WORKLOG as work on inc.NUMBER = work.NUMBER
Where inc.ASSIGNEE = 'AB049732'
and work.SUBMIT_DATE = (Select MAX(work2.SUBMIT_DATE)
from dbo.WORKLOG as work2
where work2.NUMBER = work.NUMBER
and work2.TYPE = '16000')
My tables look like this
inc
+---------+-----------+----------+
| NUMBER | UNIV_NUM | ASSIGNEE |
+---------+-----------+----------+
| 100001 | 4321781 | AB049732 |
| 100002 | 4232756 | AB049732 |
| 100003 | 4322534 | AB049732 |
| 100004 | 4328534 | AB049732 |
+---------+-----------+----------+
work
+--------+------------+-------+
| NUMBER | DATE | TYPE |
+--------+------------+-------+
| 100001 | 23/05/2018 | 16000 |
| 100003 | 22/05/2018 | 16000 |
| 100004 | 22/05/2018 | 16000 |
+--------+------------+-------+
My expected output is:
+--------+----------+----------+------------+-------+
| NUMBER | UNIV_NUM | ASSIGNEE | DATE | TYPE |
+--------+----------+----------+------------+-------+
| 100001 | 4321781 | AB049732 | 23/05/2018 | 16000 |
| 100002 | 4232756 | AB049732 | NULL | NULL |
| 100003 | 4322534 | AB049732 | 22/05/2018 | 16000 |
| 100004 | 4328534 | AB049732 | 22/05/2018 | 16000 |
+--------+----------+----------+------------+-------+
But my actual output is:
+---------+-----------+----------+------------+-------+
| NUMBER | UNIV_NUM | ASSIGNEE | DATE | TYPE |
+---------+-----------+----------+------------+-------+
| 100001 | 4321781 | AB049732 | 23/05/2018 | 16000 |
| 100003 | 4322534 | AB049732 | 22/05/2018 | 16000 |
| 100004 | 4328534 | AB049732 | 22/05/2018 | 16000 |
+---------+-----------+----------+------------+-------+
Effectively, number 100002 isn't displaying despite being in the inc table. Am I doing something wrong or is this a case of you can't join to something that doesn't exist?
Your join condition is bad. Try this:
SELECT inc.NUMBER as TICKET,
inc.UNIV_NUM,
inc.ASSIGNEE,
work.SUBMIT_DATE,
work.TYPE
FROM dbo.HELP_DESK as inc
LEFT JOIN dbo.WORKLOG as work on inc.NUMBER = work.NUMBER
and work.SUBMIT_DATE = (Select MAX(work2.SUBMIT_DATE)
from dbo.WORKLOG as work2
where work2.NUMBER = work.NUMBER
and work2.TYPE = '16000')
Where inc.ASSIGNEE = 'AB049732'
See the difference? If you put the work.SUBMIT_DATE = ... condition in the Where clause (as you did) then your join becomes an inner join. But you want an outer join.
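Here is a small sketch of that difference, run in SQLite from Python with the rows from the question. To keep it short, the filter is just TYPE = '16000' instead of the MAX(SUBMIT_DATE) subquery, and the table and column names are lower-cased stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE help_desk (number INTEGER, assignee TEXT);
    CREATE TABLE worklog (number INTEGER, submit_date TEXT, type TEXT);
    INSERT INTO help_desk VALUES
        (100001, 'AB049732'), (100002, 'AB049732'),
        (100003, 'AB049732'), (100004, 'AB049732');
    INSERT INTO worklog VALUES
        (100001, '2018-05-23', '16000'),
        (100003, '2018-05-22', '16000'),
        (100004, '2018-05-22', '16000');
""")

# Filter on the right-hand table in WHERE: the NULL-extended row for
# 100002 fails the test and is thrown away.
in_where = conn.execute("""
    SELECT inc.number, w.submit_date
    FROM help_desk inc
    LEFT JOIN worklog w ON inc.number = w.number
    WHERE w.type = '16000'
    ORDER BY inc.number
""").fetchall()

# Same filter moved into ON: it only decides which worklog rows match,
# so every help_desk row is kept.
in_on = conn.execute("""
    SELECT inc.number, w.submit_date
    FROM help_desk inc
    LEFT JOIN worklog w ON inc.number = w.number AND w.type = '16000'
    ORDER BY inc.number
""").fetchall()

print(len(in_where), len(in_on))
```

The first query returns three rows and the second returns four, with 100002 carrying a NULL submit date.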
Use window functions!
SELECT h.NUMBER as TICKET, h.UNIV_NUM, h.ASSIGNEE,
w.SUBMIT_DATE, w.TYPE
FROM dbo.HELP_DESK h LEFT JOIN
(SELECT w.*, MAX(w.SUBMIT_DATE) OVER (PARTITION BY w.NUMBER) as max_submit_date
FROM dbo.WORKLOG w
WHERE w.TYPE = '16000'
) w
ON h.NUMBER = w.NUMBER AND w.submit_date = w.max_submit_date
WHERE h.ASSIGNEE = 'AB049732';
This is subtly different from your query, but I think it is the logic you actually want. Your query will find records that have the maximum submit date for type '16000' regardless of type. I presume that you really want the types to align to the submit date.
If this interpretation is wrong, it is easy to adjust the query:
SELECT h.NUMBER as TICKET, h.UNIV_NUM, h.ASSIGNEE,
w.SUBMIT_DATE, w.TYPE
FROM dbo.HELP_DESK h LEFT JOIN
(SELECT w.*,
MAX(CASE WHEN w.TYPE = '16000' THEN w.SUBMIT_DATE END) OVER (PARTITION BY w.NUMBER) as max_submit_date
FROM dbo.WORKLOG w
) w
ON h.NUMBER = w.NUMBER AND w.submit_date = w.max_submit_date
WHERE h.ASSIGNEE = 'AB049732';
These versions are not only simpler, but they should have better performance as well.
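For anyone who wants to try the window-function version quickly, here is a sketch of the first variant in SQLite via Python. It assumes a SQLite build with window function support (3.25 or later), uses made-up rows with ISO-formatted dates so that MAX compares them correctly as text, and lower-cases the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE help_desk (number INTEGER, assignee TEXT);
    CREATE TABLE worklog (number INTEGER, submit_date TEXT, type TEXT);
    INSERT INTO help_desk VALUES
        (100001, 'AB049732'), (100002, 'AB049732'), (100003, 'AB049732');
    -- 100001 has two entries; only the latest should be returned
    INSERT INTO worklog VALUES
        (100001, '2018-05-20', '16000'),
        (100001, '2018-05-23', '16000'),
        (100003, '2018-05-22', '16000');
""")

# Tag each worklog row with the latest submit date for its ticket,
# then keep only the row whose own date equals that maximum.
rows = conn.execute("""
    SELECT h.number, w.submit_date, w.type
    FROM help_desk h
    LEFT JOIN (
        SELECT number, submit_date, type,
               MAX(submit_date) OVER (PARTITION BY number) AS max_submit_date
        FROM worklog
        WHERE type = '16000'
    ) w ON h.number = w.number AND w.submit_date = w.max_submit_date
    WHERE h.assignee = 'AB049732'
    ORDER BY h.number
""").fetchall()

for row in rows:
    print(row)
```

Ticket 100002 has no worklog entry and still appears, with NULLs in the worklog columns.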
You can use a CTE like this:
WITH WorkDates
AS (SELECT NUMBER, SUBMIT_DATE, TYPE
From WORKLOG work
Where SUBMIT_DATE = (Select MAX(work2.SUBMIT_DATE)
from dbo.WORKLOG as work2
where work2.NUMBER = work.NUMBER
and work2.TYPE = '16000'))
SELECT inc.NUMBER as TICKET,
inc.UNIV_NUM,
inc.ASSIGNEE,
WorkDates.SUBMIT_DATE,
WorkDates.TYPE
FROM dbo.HELP_DESK as inc
LEFT JOIN WorkDates on inc.NUMBER = WorkDates.NUMBER
Where inc.ASSIGNEE = 'AB049732'

Distinct code wont work SQL Server

SELECT DISTINCT
U.Unit_ID, P.Plant_ID, P.Project_NR, U.Key_code_list,
S.Status_type, Kc.Key_codes4
FROM
Plant as P
INNER JOIN
Unit as U ON P.Plant_NR = U.Plant_NR
INNER JOIN
[dbo].[Key_code_list] as Kcl ON P.Project_NR = Kcl.Project_NR
INNER JOIN
Status_codes as S ON S.Status_nr = Kcl.Status
INNER JOIN
Key_codes as Kc ON Kc.Key_code_ID = Kcl.Key_code_list_ID
I have this code, but it does not give me the outcome I hoped for. I know it's probably something easy, but I've been banging my head against the wall for an hour now, so I thought I'd ask you guys.
The outcome now is:
Unit_ID | Plant_ID | Project_NR | Key_code_list | Status_type | Key_code_4 | Key_code_ID
-----------------------------------------------------------------------------------------------
MEOD | SM | 114015 | 4 | Assigned | AC49 | 11 |
MLO | SM | 114015 | 4 | Assigned | AC49 | 11 |
MEOD | SM | 114015 | 4 | Assigned | AC47 | 12 |
MLO | SM | 114015 | 4 | Assigned | AC47 | 12 |
Each result now appears twice, which is not correct. I would like to get each answer just once. Can someone please help me?
The desired outcome is:
MEOD | SM | 114015 | 4 | Assigned | AC49 | 12 |
MLO | SM | 114015 | 4 | Assigned | AC47 | 11 |
SELECT * FROM dbo.Key_Code_List WHERE Project_NR = '114015'
Key_code_list | Status | Plant_ID | Textfield_unit | Unit_ID | Key_code_list_ID | Project_NR
4 | 2 | SM | NULL | MLO | 11 | 114015
4 | 2 | SM | NULL | MEOD | 12 | 114015
You are missing the Unit_ID field in the join to the Key_Code_List table.
SELECT DISTINCT
U.Unit_ID, P.Plant_ID, P.Project_NR, U.Key_code_list,
S.Status_type, Kc.Key_codes4
FROM
Plant as P
INNER JOIN
Unit as U ON P.Plant_NR = U.Plant_NR
INNER JOIN
[dbo].[Key_code_list] as Kcl ON
P.Project_NR = Kcl.Project_NR AND
U.Unit_ID = Kcl.Unit_ID -- add this to the JOIN condition
INNER JOIN
Status_codes as S ON S.Status_nr = Kcl.Status
INNER JOIN
Key_codes as Kc ON Kc.Key_code_ID = Kcl.Key_code_list_ID
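The effect of the missing join column can be reproduced with a much smaller setup. The sketch below uses SQLite via Python and a simplified, made-up schema (here unit carries project_nr directly, which is not how the original tables are laid out), just to show why DISTINCT cannot help: the doubled rows are genuinely different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of Unit and Key_code_list
    CREATE TABLE unit (unit_id TEXT, project_nr TEXT);
    CREATE TABLE key_code_list (
        unit_id TEXT, project_nr TEXT, key_code_list_id INTEGER);
    INSERT INTO unit VALUES ('MEOD', '114015'), ('MLO', '114015');
    INSERT INTO key_code_list VALUES
        ('MLO', '114015', 11), ('MEOD', '114015', 12);
""")

# Joining on the project only: every unit pairs with every list row
# of that project, so 2 units x 2 list rows = 4 distinct rows.
loose = conn.execute("""
    SELECT DISTINCT u.unit_id, k.key_code_list_id
    FROM unit u
    JOIN key_code_list k ON u.project_nr = k.project_nr
    ORDER BY u.unit_id, k.key_code_list_id
""").fetchall()

# Adding unit_id to the join condition pairs each unit with its own row.
tight = conn.execute("""
    SELECT DISTINCT u.unit_id, k.key_code_list_id
    FROM unit u
    JOIN key_code_list k
      ON u.project_nr = k.project_nr AND u.unit_id = k.unit_id
    ORDER BY u.unit_id, k.key_code_list_id
""").fetchall()

print(loose)
print(tight)
```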

sql compact 3.5, select top n rows from each group

I am writing a query to select the top 1 record from each group. Keep in mind that I am working on SQL Compact 3.5 and thus cannot use the rank function. I'm pretty sure my query is incorrect, but I'm not sure how to select the top record from each group. Does anyone have any ideas?
Here is the query I was trying to get working:
/*
* added fH.InvoiceNumber to my query to get result further below.
*/
select tH.*, t.CustomerNumber, c.CustomerName, fH.Status, fH.InvoiceNumber
from tenderHeader tH
join task t ON tH.TaskActivityID = t.ActivityID
join finalizeTicketHeader fH ON tH.FinalizeTicketTaskActivityID = fH.TaskActivityID
join customer c ON t.CustomerNumber = c.CustomerNumber
where fH.Status <> '3' AND t.TripID = '08ea6982-6efd-46fa-9753-0fd8b076f24c';
Here is what my tables look like:
customer table:
|------------------------------------------------|
| CustomerNumber | CustomerName | Address1 | ... |
|------------------------------------------------|
| 0012084737 | Customer A | 150 Rd A | ... |
|------------------------------------------------|
| 0012301891 | Customer B | 152 Rd A | ... |
|------------------------------------------------|
task table
|-----------------------------------------------------------------|
| ActivityID | TripID | TaskTypeName | Status | CustomerNumber |
|-----------------------------------------------------------------|
| 4967f6cc | 08ea6982 | Payment | 2 | 0012084737 |
|-----------------------------------------------------------------|
| e96469a1 | 08ea6982 | Payment | 2 | 0012301891 |
|-----------------------------------------------------------------|
finalizeTicketHeader table
|---------------------------------------------------|
| TaskActivityID | InvoiceNumber | Amount | Status |
|---------------------------------------------------|
| 916082c8 | 1000 | 563.32 | 3 |
|---------------------------------------------------|
| 916082c8 | 1001 | -343.68 | 0 |
|---------------------------------------------------|
| 4b38bf60 | 1002 | 152.29 | 0 |
|---------------------------------------------------|
| 4b38bf60 | 1003 | -35.80 | 0 |
|---------------------------------------------------|
tenderHeader table
|-------------------------------------------------------------------------------------|
| TaskActivityID | InvoiceNumber | PastDue | TodaysDue | FinalizeTicketTaskActivityID |
|-------------------------------------------------------------------------------------|
| 4967f6cc | 1234567891 | 23.55 | 219.64 | 916082c8 |
|-------------------------------------------------------------------------------------|
| e96469a1 | 1234567893 | 0.00 | 116.49 | 4b38bf60 |
|-------------------------------------------------------------------------------------|
The problem I was having was getting duplicates, like so:
|------------------------------------------------------------------------------------------------------------------------------------|
| TaskActivityID | InvoiceNumber | PastDue | TodaysDue | FinalizeTicketTaskActivityID | CustomerNumber | CustomerName | InvoiceNumber |
|------------------------------------------------------------------------------------------------------------------------------------|
| 4967f6cc | 1234567891 | 23.55 | 219.64 | 916082c8 | 0012084737 | Customer A | 1001 |
|------------------------------------------------------------------------------------------------------------------------------------|
| e96469a1 | 1234567893 | 0.00 | 116.49 | 4b38bf60 | 0012301891 | Customer B | 1002 |
|------------------------------------------------------------------------------------------------------------------------------------|
| e96469a1 | 1234567893 | 0.00 | 116.49 | 4b38bf60 | 0012301891 | Customer B | 1003 |
|------------------------------------------------------------------------------------------------------------------------------------|
I've rewritten the query like so, but I need to get specific columns from the sub query.
select tH.* from tenderHeader th
inner join task t on tH.TaskActivityID = t.ActivityID
inner join (
select k.TaskActivityID from finalizeTicketHeader k group by k.TaskActivityID
) as fH on tH.FinalizeTicketTaskActivityID = fH.TaskActivityID
inner join customer c on t.CustomerNumber = c.CustomerNumber
I need to get the status from fH. Any ideas on how to do that? I also tried this:
select tH.*, fH.Status from tenderHeader th
inner join task t on tH.TaskActivityID = t.ActivityID
inner join finalizeTicketHeader fH on tH.FinalizeTicketTaskActivityID = tH.TaskActivityID
inner join customer c on t.CustomerNumber = c.CustomerNumber
where tH.FinalizeTicketTaskActivityID = (
select top (1) k.TaskActivityID from finalizeTicketHeader k
);
But it seems that SQL Compact 3.5 does not support scalar subqueries in the WHERE clause.
Here is an example that demonstrates a way of selecting the top 1 from each group:
id|time
--------
2 | 1:10
2 | 0:45
2 | 1:45
2 | 1:30
1 | 1:00
1 | 1:10
The table is called table_1; we group by id and assume that rows should be ordered by time descending, so the top row in each group is the one with the latest time:
select table_1.* from table_1
inner join (
select id, max(time) as max_time from table_1
group by id
) as t
on t.max_time = table_1.time and table_1.id = t.id
order by table_1.id
The result we get is:
id|time
--------
1 | 1:10
2 | 1:45
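The same pattern could be applied to finalizeTicketHeader to bring back extra columns such as Status. The sketch below is run in SQLite via Python, not SQL Compact 3.5, so treat it as an illustration of the join shape only. It assumes that "top 1" means the lowest InvoiceNumber among rows whose Status is not '3', which the question does not actually state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE finalizeTicketHeader (
        TaskActivityID TEXT, InvoiceNumber INTEGER,
        Amount REAL, Status TEXT);
    -- Rows taken from the question
    INSERT INTO finalizeTicketHeader VALUES
        ('916082c8', 1000, 563.32, '3'),
        ('916082c8', 1001, -343.68, '0'),
        ('4b38bf60', 1002, 152.29, '0'),
        ('4b38bf60', 1003, -35.80, '0');
""")

# The derived table picks one InvoiceNumber per group; joining back on
# both columns recovers the full row, including Status.
rows = conn.execute("""
    SELECT fH.TaskActivityID, fH.InvoiceNumber, fH.Status
    FROM finalizeTicketHeader fH
    INNER JOIN (
        SELECT TaskActivityID, MIN(InvoiceNumber) AS first_invoice
        FROM finalizeTicketHeader
        WHERE Status <> '3'
        GROUP BY TaskActivityID
    ) AS k
      ON k.TaskActivityID = fH.TaskActivityID
     AND k.first_invoice = fH.InvoiceNumber
    ORDER BY fH.InvoiceNumber
""").fetchall()

for row in rows:
    print(row)
```

Each TaskActivityID now yields a single row, so joining this result to tenderHeader on FinalizeTicketTaskActivityID would no longer duplicate the tender rows.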