Select columns value based on logic applied on the rows

I think I need a self-join, but I'm having difficulty with this problem and hope someone can help.
I have a table with MAT_CODE, MATERIAL, VENDOR and WIN_VENDOR columns, and I am trying to generate a new column, NEW_MATCODE, according to the scenarios below.
Sample data:
NEW_MATCODE | MAT_CODE | MATERIAL    | VENDOR   | WIN_VENDOR
----------------------------------------------------------------
            |          | X-043223065 | GP002134 | GP002134
3065        |          | X-043223065 | USD005   | P10011
3065        | 3065     | X-043223065 | EUR003   | P10011
4567        | 4567     | X-023065    | UD00005  | UD00005
4567        |          | X-023065    | DF00388  | UD00005
4321        |          | X-04065     | P24005   | P24005
4321        | 4321     | X-04065     | D41111   | P24005
4321        |          | X-04065     | D46732   | P24005
            |          | X-0432065   | US7800   | D0230005
            |          | X-0432065   | EUR234   | D067805
123         | 123      | X-04322     | P0008    | P0008
123         | 1234     | X-04322     | EU0323   | P0008
123         | 1262     | X-04322     | EUR0032  | P0008
2345        | 2345     | X-04322     | DFGH322  | P12008
123456      | 123456   | X-04322     | EUR00323 | P12008
1113        | 1113     | X-04322     | EUR0032  | P12008
Logic for sets 1, 2 and 3:
Take each MATERIAL and WIN_VENDOR combination, get its unique MAT_CODE, and apply it to every row of that MATERIAL-WIN_VENDOR combination as the NEW_MATCODE.
Logic for set 4:
If no MAT_CODE exists for the combination, leave it as-is.
Logic for set 5:
When different MAT_CODEs exist for the same MATERIAL and WIN_VENDOR combination, set NEW_MATCODE to the MAT_CODE of the row where VENDOR = WIN_VENDOR.
Logic for set 6:
When different MAT_CODEs exist for the same MATERIAL and WIN_VENDOR combination and VENDOR <> WIN_VENDOR, leave MAT_CODE as-is.
I hope that is clear. Any help would be appreciated.
Thanks.

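I think the following query will get you most of the way to what you are looking for:
SELECT mat_code, material, vendor, win_vendor,
       CASE
         -- no MAT_CODE anywhere in the MATERIAL/WIN_VENDOR group: leave as-is
         WHEN COUNT(DISTINCT mat_code) OVER (PARTITION BY material, win_vendor) = 0 THEN mat_code
         -- exactly one MAT_CODE in the group: apply it to every row of the group
         WHEN COUNT(DISTINCT mat_code) OVER (PARTITION BY material, win_vendor) = 1 THEN MAX(mat_code) OVER (PARTITION BY material, win_vendor)
         -- several MAT_CODEs: take the one from the row where VENDOR = WIN_VENDOR, else keep the row's own
         ELSE NVL((SELECT sub.mat_code
                   FROM material_info sub
                   WHERE sub.material = mi.material
                     AND sub.win_vendor = mi.win_vendor
                     AND sub.vendor = sub.win_vendor), mi.mat_code)
       END AS NEW_MAT
FROM material_info mi;
The CASE expression uses analytic functions to handle sets 1-4. The ELSE branch tries to pick up set 5 and, if no matching row is found, defaults to set 6. The subquery is correlated on both material and win_vendor, so a winning row from another WIN_VENDOR group of the same material is not picked up by mistake.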
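The same rules can also be expressed with a grouped subquery instead of analytic functions. The following is only a sketch under stated assumptions: it runs against an in-memory SQLite database from Python rather than the asker's database, it uses a reduced copy of the sample rows, and the names `grp`, `n_codes`, `any_code` and `winner_code` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE material_info "
    "(mat_code TEXT, material TEXT, vendor TEXT, win_vendor TEXT)"
)
# Reduced sample: one group each for sets 2, 4, 5 and 6 from the question.
conn.executemany(
    "INSERT INTO material_info VALUES (?, ?, ?, ?)",
    [
        ("4567", "X-023065", "UD00005", "UD00005"),
        (None, "X-023065", "DF00388", "UD00005"),
        (None, "X-0432065", "US7800", "D0230005"),
        ("123", "X-04322", "P0008", "P0008"),
        ("1234", "X-04322", "EU0323", "P0008"),
        ("2345", "X-04322", "DFGH322", "P12008"),
        ("1113", "X-04322", "EUR0032", "P12008"),
    ],
)

query = """
WITH grp AS (
    -- one summary row per MATERIAL / WIN_VENDOR combination
    SELECT material, win_vendor,
           COUNT(DISTINCT mat_code) AS n_codes,
           MAX(mat_code) AS any_code,
           MAX(CASE WHEN vendor = win_vendor THEN mat_code END) AS winner_code
    FROM material_info
    GROUP BY material, win_vendor
)
SELECT mi.material, mi.vendor, mi.mat_code,
       CASE
           WHEN g.n_codes = 1 THEN g.any_code
           WHEN g.n_codes > 1 THEN COALESCE(g.winner_code, mi.mat_code)
           ELSE mi.mat_code
       END AS new_matcode
FROM material_info mi
JOIN grp g ON g.material = mi.material AND g.win_vendor = mi.win_vendor
ORDER BY mi.rowid
"""
new_codes = [row[3] for row in conn.execute(query)]
print(new_codes)
```

Because the summary is computed once per group, the "winning" code is found without a correlated scalar subquery, which also avoids an error if more than one row in a group has VENDOR = WIN_VENDOR.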

Related

How to get the set size, first and last record in a db2 ordered set with one call

I have a very big transaction table on DB2 v11, and I need to query a subset of it as efficiently as possible. All I need is the total count of the set (not known in advance; it's based on criteria, let's say 1 day), the ID of the first record, and the ID of the last record.
The old code fetched the entire set and then used only the first record's ID, the last record's ID, and the size, without making use of the rest. Now this code is timing out. It's a complex query with several joins.
Is there a way to fetch just the size of the set, the first record and the last record, all in one select query?
I've read that reordering the list in order to fetch the first record (fetching with DESC, then changing to ASC) is not efficient.
sample table 1 TRANSACTION_RECORDS:
tdID TIMESTAMP name
-------------------------------
123 2020-03-31 john
234 2020-03-31 dan
456 2020-03-01 Eve
675 2020-04-01 joy
sample table 2 TRANSACTION_TYPE:
invoiceId tdID account
------------------------------
897 123 abc
898 123 def
877 234 mnc
899 456 opp
Sample query
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
group by tr.tdID
order by TR.tdID ASC
This results in multiple rows (but it requires the GROUP BY):
123,123
234,234
456,456
What I want is:
123,456

As I mentioned in the comments, for this query you need neither GROUP BY nor ORDER BY; just do:
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
It should work as expected
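Since the question also asks for the size of the set, a count can go in the same select. The sketch below is an illustration only: it uses SQLite from Python instead of DB2, the timestamp column is renamed to `ts` (an assumption, to avoid the type keyword), and the IDs are taken from the `tdID` column of the sample tables. The half-open date range is a substitute for `Date(...) = ...`; a plain range on the column may let an index on it be used, but that depends on the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transaction_records (tdID INTEGER, ts TEXT, name TEXT);
CREATE TABLE transaction_type (invoiceId INTEGER, tdID INTEGER, account TEXT);
INSERT INTO transaction_records VALUES
    (123, '2020-03-31', 'john'), (234, '2020-03-31', 'dan'),
    (456, '2020-03-01', 'Eve'),  (675, '2020-04-01', 'joy');
INSERT INTO transaction_type VALUES
    (897, 123, 'abc'), (898, 123, 'def'), (877, 234, 'mnc'), (899, 456, 'opp');
""")

# One aggregate row: joined row count, distinct transaction count, first ID, last ID.
summary = conn.execute("""
    SELECT COUNT(*), COUNT(DISTINCT tr.tdID), MIN(tr.tdID), MAX(tr.tdID)
    FROM transaction_records tr
    JOIN transaction_type tt ON tr.tdID = tt.tdID
    WHERE tr.ts >= '2020-03-31' AND tr.ts < '2020-04-01'
""").fetchone()
print(summary)
```

Note that the join repeats a transaction once per matching TRANSACTION_TYPE row, so `COUNT(*)` counts joined rows while `COUNT(DISTINCT tr.tdID)` counts transactions; which one is "the size of the set" depends on what the original code counted.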

Combining Data from Two Tables into One Row

I've read around quite a bit for a solution to my problem but I can't seem to get it to work. It seems like a simple problem but I'm not getting the result set I want.
I'm working on a report that needs to pull from two tables and essentially create one row of data for each employee. The file needs to be uploaded to a healthcare vendor.
Here is an example of the data
Table1: EmployeeCheckDeduction
Employee ID | Deduction Amount | Check Date
-------------------------------------------
1234        | 50.00            | 6/30/2015
1234        | 50.00            | 7/15/2015
4567        | 100.00           | 6/30/2015
4567        | 100.00           | 7/15/2015
9876        | 75.00            | 6/30/2015
9876        | 75.00            | 7/15/2015
Table2: EmployerContribution
Employee ID | Contribution Amount | Check Date
----------------------------------------------
1234        | 25.00               | 6/30/2015
1234        | 30.00               | 7/15/2015
4567        | 50.00               | 6/30/2015
4567        | 60.00               | 7/15/2015
Part of the problem is that not every record in Table1 will have a corresponding match in Table 2. If they are maxed out on contributions, they won't receive one on that pay. What I want is a result set that looks like this:
Employee ID | Deduction Amount | Contribution Amount | Check Date
-----------------------------------------------------------------
1234        | 50.00            | 25.00               | 6/30/2015
1234        | 50.00            | 30.00               | 7/15/2015
4567        | 100.00           | 50.00               | 6/30/2015
4567        | 100.00           | 60.00               | 7/15/2015
9876        | 75.00            | 0.00                | 6/30/2015
9876        | 75.00            | 0.00                | 7/15/2015
No matter how I try and join, it's just duplicating data. I've tried using subqueries or distinct records and no matter what I try, it's not giving me what I want. I can't figure out what I'm doing wrong.
Edit. See links below for dataset results.
http://s000.tinyupload.com/index.php?file_id=01551050904538574848
http://s000.tinyupload.com/index.php?file_id=63978789937644749322
http://s000.tinyupload.com/index.php?file_id=28700836121558977952
I think part of the problem is that in the Employee Check Deduction table there is a specific deduction code that I'm pulling out. In the employer deduction table that code also exists. However, whenever I try and add the join on those 2 fields in addition to employee ID and check date, it doesn't return results from the employees who have a deduction amount in the employee check deduction table when they don't have a corresponding record in the Employer Contribution Table. I hope that helps.

select a.employee_id, a.deduction_amount, b.contribution_amount,
a.check_date
from employeecheckdeduction a join employercontribution b
on a.employee_id = b.employee_id
and a.check_date = b.check_date
Try this, renaming the columns if necessary. You should join on both the employee ID and the check date.

SELECT a.EMPLID as EmployeeID, a.AL_AMOUNT as EmployeeContribution, ISNULL(b.AL_AMOUNT, 0) AS EmployerContribution, a.CHECK_DT
FROM PS_AL_CHK_DED as a
LEFT JOIN PS_AL_CHK_MEMO as b ON a.EMPLID = b.EMPLID AND a.CHECK_DT = b.CHECK_DT
WHERE a.AL_DEDCD = 'H'
ORDER BY a.CHECK_DT
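To see the LEFT JOIN approach against the question's sample data, here is a small sketch. It is not the asker's schema: it uses SQLite from Python, and the table and column names (`deduction`, `contribution`, `amount`) are simplified stand-ins; dates are stored as ISO strings so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deduction (employee_id INTEGER, amount REAL, check_date TEXT);
CREATE TABLE contribution (employee_id INTEGER, amount REAL, check_date TEXT);
INSERT INTO deduction VALUES
    (1234, 50.0, '2015-06-30'), (1234, 50.0, '2015-07-15'),
    (4567, 100.0, '2015-06-30'), (4567, 100.0, '2015-07-15'),
    (9876, 75.0, '2015-06-30'), (9876, 75.0, '2015-07-15');
INSERT INTO contribution VALUES
    (1234, 25.0, '2015-06-30'), (1234, 30.0, '2015-07-15'),
    (4567, 50.0, '2015-06-30'), (4567, 60.0, '2015-07-15');
""")

# Keep every deduction row; a missing employer contribution becomes 0.
rows = conn.execute("""
    SELECT d.employee_id, d.amount, COALESCE(c.amount, 0.0), d.check_date
    FROM deduction d
    LEFT JOIN contribution c
           ON c.employee_id = d.employee_id
          AND c.check_date = d.check_date
    ORDER BY d.employee_id, d.check_date
""").fetchall()
for row in rows:
    print(row)
```

If the contribution table also needs a filter (such as the deduction code mentioned in the question), putting that condition in the `ON` clause rather than in `WHERE` keeps the unmatched deduction rows; a `WHERE` condition on the outer-joined table discards the rows where it is NULL.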

The problem here is that you are dealing with TWO different things:
a) You want a FULL list of all employees who have ever had a contribution AND/OR a deduction
b) You then want to display it in one row
While there are many ways to go about this, the first thing would be to separate out those two and handle them.
Also, it is not clear whether you want to GROUP BY the employee ID or see a full history. If you want the history, the best thing would be a FULL JOIN such as:
select coalesce(d.employee_id, c.employee_id) employee_id
     , d.deduction_amount
     , c.contribution_amount
     , coalesce(d.deduction_date, c.contribution_date) check_date
from employee_deduction d
full join employee_contribution c
       on c.employee_id = d.employee_id
      and c.contribution_date = d.deduction_date  -- match per check date, otherwise each deduction pairs with every contribution
where d.DED_CD = 'H' or c.MEMO_CD = 'H'
A FULL JOIN includes ALL records from the first table and ALL records from the joined table, whether or not a match exists on the other side; the unmatched side's columns come back as NULL.
That is why you need COALESCE in the select clause, so that the key columns don't end up NULL in the result set.
Hope this works.

Joining a table to two one-to-many relationship tables in SQL Server

Happy Friday folks,
I'm trying to write an SSRS report displaying data from three (actually about 12, but only three relevant) tables that have awkward relationships, and the SQL query behind the data is proving difficult.
There are three entities involved: a Purchase Order, a Sales Order, and a Delivery. The problem is that a Purchase Order can have many sales orders, and also many deliveries which are NOT linked to the sales orders...that would be too easy.
Both the Sales Order and Delivery tables can be linked to the Purchase Order table by foreign keys and an intermediate table each.
I need to basically list Purchase Orders, a list of sales orders and a list of deliveries next to them, with NULLs for any fields that aren't valid so that'll give the required output in SSRS/when read by a human, ie, for a purchase order with 2 sales orders and 4 delivery dates;
PO SO Delivery
1234 ABC 05/10
1234 DEF 09/10
1234 NULL 10/12
1234 NULL 14/12
The above (when grouped by PO) will tell the users there are two sales orders and four (unlinked) delivery dates.
Likewise if there are more SOs than deliveries, we need NULLs in the Delivery column;
PO SO Delivery
1234 ABC 03/08
1234 DEF NULL
1234 GHI NULL
1234 JKL NULL
Above would be the case with 4 SOs and one delivery date.
Using Left Outer joins alone gives too much duplication - in this case 8 rows, as it gives 4 delivery dates for each match on the sales order;
PO SO Delivery
1234 ABC 05/10
1234 ABC 09/10
1234 ABC 10/12
1234 ABC 14/12
1234 DEF 05/10
1234 DEF 09/10
1234 DEF 10/12
1234 DEF 14/12
It's fine that the PO column is duplicated as SSRS can visually group that - but the SO/Delivery fields can't be allowed to duplicate as this can't be got rid of in the report - if I group the column in SSRS by SO then it still spits out 4 delivery dates for each one.
The only situation where our query works nicely is when there is just one SO per PO. In that case the single PO and SO numbers are duplicated together for x deliveries and can both be neatly grouped in SSRS. Unfortunately this is a rare occurrence in the data.
I've thought of trying to use some sort of windowing function or CROSS APPLY, but both fall down as they will repeat for every PO number listed and end up spitting out too much data.
I'm at the point of thinking this just isn't set-based enough to be doable in SQL. I know the data is horrible.
Any help much appreciated.
EDIT - basic sqlfiddle link to the table schemas. Omitted many columns which aren't relevant. http://sqlfiddle.com/#!2/5ba16
Example data...
Purchase Order
PO_Number | Style
----------------------------
1001      | Black work boots
1002      | Green hat
1006      | Red Scarf
Sales Order
Sales_order_number | PO_number | Qty  | Retailer
------------------------------------------------
A100-21            | 1001      | 15   | Walmart
A100-22            | 1001      | 29   | Walmart
A200-31            | 1006      | 1000 | Asda
Delivery
Delivery_ID | Delivery_Date | PO_number
---------------------------------------
1543285     | 10/05/2014    | 1001
1543286     | 12/05/2014    | 1001
1543287     | 17/05/2014    | 1001
1543288     | 21/05/2014    | 1002

If you assign row numbers to the elements in salesorders and deliveries, you can link on that.
Something like this:
declare @salesorders table (po int, so varchar(10))
declare @deliveries table (po int, delivery date)
declare @purchaseorders table (po int)
insert @purchaseorders values (123),(456)
insert @salesorders values (123,'a'),(123,'b'),(456,'c')
insert @deliveries values (123,'2014-1-1'),(456,'2014-2-1'),(456,'2014-2-1')
select *
from
(
    -- one candidate slot per PO and number; each sales order / delivery fills the slot matching its row number
    select numbers.number, p.po, so.so, d.delivery
    from @purchaseorders p
    cross join (select number from master..spt_values where type='p') numbers
    left join (select *, ROW_NUMBER() over (partition by po order by so) sor from @salesorders) so
        on p.po = so.po and numbers.number = so.sor
    left join (select *, ROW_NUMBER() over (partition by po order by delivery) dor from @deliveries) d
        on p.po = d.po and numbers.number = d.dor
) v
where so is not null or delivery is not null
order by po, number
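The row-number pairing can be tried on the question's sample data as follows. This is a sketch with some swaps, not the answer's exact query: it runs on SQLite from Python (window functions need SQLite 3.25 or later), and instead of a numbers table it builds the list of slots as the union of the row numbers that actually occur. The CTE names `s`, `d` and `slots` are illustrative, and dates are written as ISO strings so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_order (so_number TEXT, po_number INTEGER);
CREATE TABLE delivery (delivery_id INTEGER, delivery_date TEXT, po_number INTEGER);
INSERT INTO sales_order VALUES
    ('A100-21', 1001), ('A100-22', 1001), ('A200-31', 1006);
INSERT INTO delivery VALUES
    (1543285, '2014-05-10', 1001), (1543286, '2014-05-12', 1001),
    (1543287, '2014-05-17', 1001), (1543288, '2014-05-21', 1002);
""")

rows = conn.execute("""
    WITH s AS (   -- number the sales orders within each PO
        SELECT po_number, so_number,
               ROW_NUMBER() OVER (PARTITION BY po_number ORDER BY so_number) AS rn
        FROM sales_order
    ),
    d AS (        -- number the deliveries within each PO
        SELECT po_number, delivery_date,
               ROW_NUMBER() OVER (PARTITION BY po_number ORDER BY delivery_date) AS rn
        FROM delivery
    ),
    slots AS (    -- every (PO, position) used by either list
        SELECT po_number, rn FROM s
        UNION
        SELECT po_number, rn FROM d
    )
    SELECT slots.po_number, s.so_number, d.delivery_date
    FROM slots
    LEFT JOIN s ON s.po_number = slots.po_number AND s.rn = slots.rn
    LEFT JOIN d ON d.po_number = slots.po_number AND d.rn = slots.rn
    ORDER BY slots.po_number, slots.rn
""").fetchall()
for row in rows:
    print(row)
```

Each PO gets as many rows as the longer of its two lists, with NULL filling the shorter one. The pairing of a given sales order with a given delivery on the same row is positional only and carries no meaning.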

SQL select output to XML

Trying to do a select in SQL Server 2005 and send the output to xml. Table 2 is a general use table with various types of info. Some product info is in there if it's type 2, it's a sales lead if it's type 1. We can have multiple sales leads and products for each case_num from table 1.
Table 1
case_num,
date
table 2 (general use)
case_num,
rec_type (1=sales lead; 2=product),
various info based on type in generic columns =
col_a,
col_b,
I'm trying something like:
select
case.case_num
,case.date
,product.col_a as product_name
,product.col_b as product_price
,lead.col_a as sales_lead_name
,lead.col_b as sales_lead_address
from
table_1 case
,table_2 product
,table_2 lead
where
(case.case_num = product.case_num AND product.rec_type = 2)
OR
(case.case_num = lead.case_num AND lead.rec_type = 1)
for xml auto, elements
This is bringing back results like
<case>
<case_num>1</case_num>
<date>1/1/2013</date>
<product>
<product_name>name</product_name>
<product_price>1.00</product_price>
<lead>
<sales_lead_name>bob smith</sales_lead_name>
<sales_lead_address>address 1</sales_lead_address>
</lead>
</product>
<product>
<product_name>name2</product_name>
<product_price>2.00</product_price>
<lead>
<sales_lead_name>bob smith</sales_lead_name>
<sales_lead_address>address 1</sales_lead_address>
</lead>
</product>
</case>
I don't want the name repeating for every product. With multiple products and multiple leads, how do I format the SQL so it doesn't make sort of a Cartesian product in my results?
I made another example to illustrate my problem. SQL Fiddle example
This is making a cartesian result, matching all parts to all persons. I want to have one case then each part then each person, then close case.
I was trying DISTINCT and getting errors. I thought about UNION to tie two together, but I don't think I can do that within a bigger select for my case.
What I’m getting:
CASE_NUM DATE PART_NAME PART_PRICE PERSON_NAME COMPANY
1 2013-01-01 stapler 1.00 bob smith acme supplies
1 2013-01-01 matches 2.00 bob smith acme supplies
1 2013-01-01 stapler 1.00 john doe john supply inc
1 2013-01-01 matches 2.00 john doe john supply inc
What I want:
CASE_NUM DATE PART_NAME PART_PRICE PERSON_NAME COMPANY
1 2013-01-01 bob smith acme supplies
1 2013-01-01 john doe john supply inc
1 2013-01-01 matches 2.00
1 2013-01-01 stapler 1.00

As @marc_s points out, you create the Cartesian product yourself by 'joining' the tables the way you do. Always try to use JOIN instead.
I believe the following query would fit your needs:
select
[case].case_num
,[case].date
,lead.col_a as sales_lead_name
,lead.col_b as sales_lead_address
,product.col_a as product_name
,product.col_b as product_price
from
table_1 [case]
JOIN table_2 lead ON [case].case_num = lead.case_num
AND lead.rec_type = 1
JOIN table_2 product ON [case].case_num = product.case_num
AND product.rec_type = 2
FOR XML auto, elements;
You can view it on SQLFiddle.com
The output will look like this:
<case>
<case_num>1</case_num>
<date>2013-01-01</date>
<lead>
<sales_lead_name>bob smith</sales_lead_name>
<sales_lead_address>address 1</sales_lead_address>
<product>
<product_name>name</product_name>
<product_price>1.00</product_price>
</product>
<product>
<product_name>name2</product_name>
<product_price>2.00</product_price>
</product>
</lead>
</case>

A friend suggested only joining once, then filtering the select based on case statements and I think this is going to work. Thanks folks
select case_num = case
when child.rec_type = '1' then mast.case_num
when child.rec_type = '2' then mast.case_num
else '' end
,mast_date = case
when child.rec_type = '1' then mast.date
when child.rec_type = '2' then mast.date
else '' end
,child.rec_type
,part_name = case when child.rec_type = '1' then child.col_a else '' end
,part_price = case when child.rec_type = '1' then child.col_b else '' end
,subject_name = case when child.rec_type = '2' then child.col_a else '' end
,subject_type = case when child.rec_type = '2' then child.col_b else '' end
from table_master mast
join table_child child on mast.case_num = child.case_num
--for xml auto, elements;

Since no one has answered the question: I have done something similar in the past. I can't remember exactly how I did it, and it's hard to guess without the data available, but here is something to play with. As far as I remember I did something like this to get the format you are after, and it was on SQL Server 2005, so it should work for you:
select [case].case_num, [case].date,
       (SELECT col_a [@productname]
              ,col_b [@productprice]
        FROM table_2 t2
        WHERE t2.case_num = [case].case_num
        FOR XML PATH('Details'), TYPE)
from table_1 [case]
FOR XML PATH('Case'), ROOT('Cases')

Optimal solution for interview question

Recently in a job interview, I was given the following problem.
Say I have the following table
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
b | 30.00 | 1
c | 20.00 | 1
d | 25.00 | 1
where widget_name holds the name of the widget, widget_costs is the price of a widget, and in_stock is a constant of 1.
Now, for my business insurance I have a certain deductible. I am looking for a SQL statement that will return every widget, and its price, that remains after the deductible has been met. So if my deductible is $50.00 the above would just return
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
d | 25.00 | 1
since widgets b and c were used to meet the deductible.
The closest I could get is the following
SELECT
*
FROM (
SELECT
widget_name,
widget_price
FROM interview.tbl_widgets
minus
SELECT widget_name,widget_price
FROM (
SELECT
widget_name,
widget_price,
50 - sum(widget_price) over (ORDER BY widget_price ROWS between unbounded preceding and current row) as running_total
FROM interview.tbl_widgets
)
where running_total >= 0
)
;
Which gives me
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
c | 20.00 | 1
d | 25.00 | 1
because it uses a and b to meet the majority of the deductible
I was hoping someone might be able to show me the correct answer
EDIT: I understood the interview question to be asking this: given a table of widgets and their prices and a dollar amount, subtract as many of the widgets as you can up to the dollar amount, and return the widgets (and their prices) that remain.

I'll put an answer up, just in case it's easier than it looks, but if the idea is just to return any widget that costs more than the deductible then you'd do something like this:
Select
Widget_Name, Widget_Cost, In_Stock
From
Widgets
Where
Widget_Cost > 50 -- SubSelect for variable deductibles?
For your sample data my query returns no rows.

I believe I understand your question, but I'm not 100%. Here is what I'm assuming you mean:
Your deductible is, say, $50. To meet the deductible you have to "use" two items. (Is this always two? How high can it go? Can it be just one? What if they don't total exactly $50? There is a lot of missing information.) You then want to return the widgets that aren't being used towards the deductible. I have the following.
CREATE TABLE #test
(
widget_name char(1),
widget_cost money
)
INSERT INTO #test (widget_name, widget_cost)
SELECT 'a', 15.00 UNION ALL
SELECT 'b', 30.00 UNION ALL
SELECT 'c', 20.00 UNION ALL
SELECT 'd', 25.00
SELECT * FROM #test t1
WHERE t1.widget_name NOT IN (
SELECT t1.widget_name FROM #test t1
CROSS JOIN #test t2
WHERE t1.widget_cost + t2.widget_cost = 50 AND t1.widget_name != t2.widget_name)
Which returns
widget_name widget_cost
----------- ---------------------
a 15.00
d 25.00

This looks like a Bin Packing problem. These are really hard to solve, especially with SQL.
If you search on SO for Bin Packing + SQL, you'll find the question "How to find Sum(field) in condition, i.e. “select * from table where sum(field) < 150”", which is basically the same problem except that you want to add a NOT IN to it.
I couldn't get the accepted answer by brianegge to work, but what he wrote about it in general was interesting:
..the problem you describe of wanting the selection of users which would most closely fit into a given size, is a bin packing problem. This is an NP-Hard problem, and won't be easily solved with ANSI SQL. However, the above seems to return the right result, but in fact it simply starts with the smallest item, and continues to add items until the bin is full.
A general, more effective bin packing algorithm would is to start with the largest item and continue to add smaller ones as they fit. This algorithm would select users 5 and 4.
So with this advice you could write a cursor to loop over the table and do just this (it just wouldn't be pretty).
Aaron Alton gives a nice link to a series of articles that attempt to solve the Bin Packing problem with SQL, but they basically conclude that it's probably best to use a cursor to do it.
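The largest-first loop described in the quote can be sketched outside the database. This is a minimal illustration in Python rather than a SQL cursor, and the function name `remaining_after_deductible` is made up here. It is a greedy heuristic, so it is not guaranteed to find the best fit for every set of prices.

```python
def remaining_after_deductible(widgets, deductible):
    """Greedy largest-first: use widgets toward the deductible while they fit,
    and return the widgets that were not used."""
    used_total = 0.0
    remaining = []
    # Visit the most expensive widgets first, as the quoted advice suggests.
    for name, price in sorted(widgets, key=lambda w: w[1], reverse=True):
        if used_total + price <= deductible:
            used_total += price              # this widget goes toward the deductible
        else:
            remaining.append((name, price))  # does not fit, so it is returned
    return sorted(remaining)


widgets = [("a", 15.0), ("b", 30.0), ("c", 20.0), ("d", 25.0)]
left = remaining_after_deductible(widgets, 50.0)
print(left)
```

On the question's sample data this uses b (30.00) and c (20.00) to reach exactly 50.00 and returns a and d, which matches the expected result in the question. A smallest-first pass, like the running-total query in the question, would instead use a and b and return c and d.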
