remove hard coded values from view - sql

I have a table, tblSwaps. It looks a bit like below.
swapdate region bb_ticker swap_units
2017-01-01 EU ABC 10
2017-01-01 US ABC 40
2017-01-01 EU DEF 13
2017-01-01 US DEF 12
2017-02-20 EU ABC 8
2017-02-20 US ABC 40
2017-02-20 EU DEF 13
2017-02-20 US DEF 12
I also have another table, tblCodes
code
ABC
DEF
I then have a view, vw_SwapTotal, query below. This basically sums the number of swap units for each bb_ticker for a certain date.
SELECT swapdate, bb_ticker, SUM(swap_units) AS total_swap_units
FROM tblSwaps
GROUP BY swapdate, bb_ticker
I have created another view (this is where I have a question), shown below, which builds on the view above (I am not sure whether that is the best idea). The problem with the query below is that I have hard-coded the bb_tickers (ABC, DEF), which is not ideal because new bb_tickers will be added in the future.
WITH swp AS (
SELECT *
FROM vw_SwapTotal
WHERE bb_ticker IN ('ABC', 'DEF'))
SELECT swapdate, isnull(ABC, 0) ABC, isnull(DEF, 0) DEF
FROM swp AS source PIVOT (max(total_swap_units) FOR bb_ticker IN ([ABC], [DEF])) AS pvt
What is the best way to get rid of the hard coded bb_tickers in this view?

Could this be what you want?
WHERE bb_ticker IN (select distinct bb_ticker from tblSwaps))
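The subquery removes the hard-coded list from the WHERE clause, but the column list inside PIVOT still has to be written out literally. One common workaround is to generate the statement from tblCodes at run time. The sketch below illustrates that idea in Python with sqlite3, which is an assumption (the question appears to target SQL Server); it uses conditional aggregation in place of PIVOT, and `build_pivot_sql` is a hypothetical helper, not something from the original post.

```python
import sqlite3

def build_pivot_sql(codes):
    """Build a pivot-style query with one output column per ticker code."""
    columns = []
    for code in codes:
        literal = code.replace("'", "''")      # escape for a string literal
        identifier = code.replace('"', '""')   # escape for a quoted identifier
        columns.append(
            "COALESCE(MAX(CASE WHEN bb_ticker = '%s' THEN total_swap_units END), 0) AS \"%s\""
            % (literal, identifier)
        )
    return (
        "SELECT swapdate, " + ", ".join(columns) +
        " FROM vw_SwapTotal GROUP BY swapdate ORDER BY swapdate"
    )

# In-memory copy of the question's tables and first view (illustrative setup).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblSwaps (swapdate TEXT, region TEXT, bb_ticker TEXT, swap_units INTEGER);
    CREATE TABLE tblCodes (code TEXT);
    CREATE VIEW vw_SwapTotal AS
        SELECT swapdate, bb_ticker, SUM(swap_units) AS total_swap_units
        FROM tblSwaps
        GROUP BY swapdate, bb_ticker;
""")
conn.executemany("INSERT INTO tblSwaps VALUES (?, ?, ?, ?)", [
    ("2017-01-01", "EU", "ABC", 10), ("2017-01-01", "US", "ABC", 40),
    ("2017-01-01", "EU", "DEF", 13), ("2017-01-01", "US", "DEF", 12),
    ("2017-02-20", "EU", "ABC", 8),  ("2017-02-20", "US", "ABC", 40),
    ("2017-02-20", "EU", "DEF", 13), ("2017-02-20", "US", "DEF", 12),
])
conn.executemany("INSERT INTO tblCodes VALUES (?)", [("ABC",), ("DEF",)])

# The column list now comes from tblCodes instead of being typed into the query.
codes = [row[0] for row in conn.execute("SELECT code FROM tblCodes ORDER BY code")]
pivot_rows = conn.execute(build_pivot_sql(codes)).fetchall()
```

Adding a row to tblCodes adds a column to the result without touching the query text. On SQL Server the equivalent would be dynamic SQL, which cannot live inside a view, so this approach usually ends up in a stored procedure or in application code.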

Related

Select row each time value changes based on date order

Good day. Hopefully I can explain my problem well enough. I do not have a sample query, as I cannot get anything to work. I have a table that contains a list of accounts, their status, and the date a change occurred on the account. I need to pull the account number and status each time the status changes, along with the first date of that change. I have tried using rank, min(date), and max(date); the problem I am running into is that an account can go back and forth between statuses, and I need a row each time it changes. Also, a new row could be in the table with the same account number and status but a different update date.
This is sample data:
abc C50 1/20/2022
abc C50 1/21/2022
abc C09 2/20/2022
abc C50 3/1/2022
def A54 1/20/2022
def A26 1/21/2022
def A26 2/20/2022
def A54 3/1/2022
As you can see, account abc has 3 instances of C50 and one instance of C09. For my results I would expect to see the earliest of the first two C50s, then the C09, and then the next C50, since it is a new status change.
abc C50 1/20/2022
abc C09 2/20/2022
abc C50 3/1/2022
For the second account, def, I would expect to see the first A54, the first A26, and then the next A54 on 3/1, as it is a new instance.
def A54 1/20/2022
def A26 1/21/2022
def A54 3/1/2022
IMO, with this kind of problem it is better to think of a different kind of model. The query you are asking for is not easy to write (if possible at all) and would be difficult to maintain if your underlying structure changes.
So what options do you have?
Option 1
Create a summary table of ACTUAL changes. Think of this table as a log/audit table where you only record a change when it happens. Then you can simply read from this table and output the results.
[ChangeTable]
ABC C50 1/20/2022
ABC C09 2/20/2022
ABC C50 3/1/2022
Option 2
Create a revision column on your main table. You would start with 0 for the first instance and only increase the revision when a change happens, e.g.
REVISION
abc C50 1/20/2022 0 First instance
abc C50 1/21/2022 -1 or null if you prefer indicates no change
abc C09 2/20/2022 1 First change
abc C50 3/1/2022 2 Second change
Then you can get all revisions for an account where the revision is greater than zero, ordered by revision (or descending, so the newest is on top). You can also use this column to show the 1st change, 2nd change, etc.
With both options you would need to check for changes before you insert, and append/update where appropriate.
Now you will probably say that this is data you receive and that you can't modify the application. If so, you may be able to write a small application that is scheduled to do this work for you on an hourly/nightly basis.
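As a rough illustration of Option 1, the change-only rows can be derived by comparing each row with the previous row for the same account. The sketch below is in Python, with the question's sample data hard-coded and a hypothetical helper named `status_changes`; it is one possible shape for the scheduled job, not a definitive implementation.

```python
from datetime import date
from itertools import groupby

# Sample rows from the question: (account, status, update date).
rows = [
    ("abc", "C50", date(2022, 1, 20)),
    ("abc", "C50", date(2022, 1, 21)),
    ("abc", "C09", date(2022, 2, 20)),
    ("abc", "C50", date(2022, 3, 1)),
    ("def", "A54", date(2022, 1, 20)),
    ("def", "A26", date(2022, 1, 21)),
    ("def", "A26", date(2022, 2, 20)),
    ("def", "A54", date(2022, 3, 1)),
]

def status_changes(rows):
    """Return the first row of every run of identical statuses, per account."""
    changes = []
    # Sort by account, then by date, so each account's history is in order.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for account, account_rows in groupby(ordered, key=lambda r: r[0]):
        previous_status = None
        for _, status, changed_on in account_rows:
            # Keep a row only when the status differs from the one before it.
            if status != previous_status:
                changes.append((account, status, changed_on))
            previous_status = status
    return changes

change_table = status_changes(rows)
```

Because only the previous row is compared, a status that returns after an intervening change (C50, C09, C50) is recorded again, which is the behaviour the question asks for.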

Create column based on grouping other values

I have difficulties formulating my issue.
I have a view which returns these results. I need to add a column to the view that assigns the same number to both legs of a round trip.
Flt_No | From_Airport | To_Airport | Dep_Date       | RequiredResult
124    | LCA          | CDG        | 10/19/14 5:00  | 1
125    | CDG          | LCA        | 10/19/14 10:00 | 1
197    | LCA          | BCN        | 10/4/12 5:00   | 2
198    | BCN          | LCA        | 10/4/12 11:00  | 2
501    | LCA          | HER        | 15/8/12 12:05  | 3
502    | HER          | LCA        | 15/8/12 15:15  | 3
I.e. flight 124 is going from Larnaca to CDG, and flight 125 is going back from CDG to Larnaca - they both have to have the same identifier.
Round-trip flights will always have consecutive flight numbers.
I have a bunch of conditions which I won't write now.
Omitting hours is not an option, they're important.
I was thinking dense_rank() but I don't know how to create one identifier for 2 flights with different numbers, please help.
If your data is similar to the sample data posted, then the following query should give the required result:
SELECT *,
DENSE_RANK() OVER (ORDER BY CASE
WHEN From_Airport < To_Airport THEN From_Airport
ELSE To_Airport
END)
FROM mytable
Join conditions are not limited to simple equality. Assuming {Flight No, Departure, Destination} is unique on any one day, then a self join should do it:
select whatever
from flights outbound
inner join flights inbound on outbound.flt_no + 1 = inbound.flt_no
and cast(outbound.dep_date as date)
= cast(inbound.dep_date as date)
and outbound.From_Airport = inbound.To_Airport
and outbound.To_Airport = inbound.From_Airport
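To see the self join produce a shared identifier, here is a small sketch in Python with sqlite3. Several things are assumptions made for the illustration: the table is named `flights`, departure times are stored as ISO strings (the sample mixes date formats), and SQLite's `date()` stands in for the cast to a date. Numbering the pairs with `enumerate` is just one way to turn matched rows into an identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (flt_no INTEGER, from_airport TEXT, to_airport TEXT, dep_date TEXT)"
)
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", [
    (124, "LCA", "CDG", "2014-10-19 05:00"),
    (125, "CDG", "LCA", "2014-10-19 10:00"),
    (197, "LCA", "BCN", "2012-10-04 05:00"),
    (198, "BCN", "LCA", "2012-10-04 11:00"),
    (501, "LCA", "HER", "2012-08-15 12:05"),
    (502, "HER", "LCA", "2012-08-15 15:15"),
])

# Match each outbound flight with the next flight number on the same day
# that flies the reverse route.
pairs = conn.execute("""
    SELECT outbound.flt_no, inbound.flt_no
    FROM flights AS outbound
    JOIN flights AS inbound
      ON outbound.flt_no + 1 = inbound.flt_no
     AND date(outbound.dep_date) = date(inbound.dep_date)
     AND outbound.from_airport = inbound.to_airport
     AND outbound.to_airport = inbound.from_airport
    ORDER BY outbound.flt_no
""").fetchall()

# Give both legs of each matched pair the same identifier.
pair_ids = {}
for pair_id, (outbound_no, inbound_no) in enumerate(pairs, start=1):
    pair_ids[outbound_no] = pair_id
    pair_ids[inbound_no] = pair_id
```

Flights without a matching return leg simply do not appear in `pair_ids`, so they would need separate handling if they exist in the real data.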

Optimal solution for interview question

Recently in a job interview, I was given the following problem.
Say I have the following table
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
b | 30.00 | 1
c | 20.00 | 1
d | 25.00 | 1
where widget_name holds the name of the widget, widget_costs is the price of a widget, and in_stock is a constant of 1.
Now, for my business insurance I have a certain deductible. I am looking for a SQL statement that will tell me every widget and its price that exceeds the deductible. So if my deductible is $50.00, the above would just return
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
d | 25.00 | 1
since widgets b and c were used to meet the deductible.
The closest I could get is the following
SELECT
*
FROM (
SELECT
widget_name,
widget_price
FROM interview.tbl_widgets
minus
SELECT widget_name,widget_price
FROM (
SELECT
widget_name,
widget_price,
50 - sum(widget_price) over (ORDER BY widget_price ROWS between unbounded preceding and current row) as running_total
FROM interview.tbl_widgets
)
where running_total >= 0
)
;
Which gives me
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
c | 20.00 | 1
d | 25.00 | 1
because it uses a and b to meet the majority of the deductible
I was hoping someone might be able to show me the correct answer
EDIT: I understood the interview question to be asking this: given a table of widgets and their prices and a dollar amount, subtract as many of the widgets as you can up to the dollar amount, and return the widgets and prices that remain.
I'll put an answer up, just in case it's easier than it looks, but if the idea is just to return any widget that costs more than the deductible then you'd do something like this:
Select
Widget_Name, Widget_Cost, In_Stock
From
Widgets
Where
Widget_Cost > 50 -- SubSelect for variable deductibles?
For your sample data my query returns no rows.
I believe I understand your question, but I'm not 100%. Here is what I'm assuming you mean:
Your deductible is, say, $50. To meet the deductible you have to "use" two items. (Is this always two? How high can it go? Can it be just one? What if they don't total exactly $50? There is a lot of missing information.) You then want to return the widgets that aren't being used towards the deductible. I have the following.
CREATE TABLE #test
(
widget_name char(1),
widget_cost money
)
INSERT INTO #test (widget_name, widget_cost)
SELECT 'a', 15.00 UNION ALL
SELECT 'b', 30.00 UNION ALL
SELECT 'c', 20.00 UNION ALL
SELECT 'd', 25.00
SELECT * FROM #test t1
WHERE t1.widget_name NOT IN (
SELECT t1.widget_name FROM #test t1
CROSS JOIN #test t2
WHERE t1.widget_cost + t2.widget_cost = 50 AND t1.widget_name != t2.widget_name)
Which returns
widget_name widget_cost
----------- ---------------------
a 15.00
d 25.00
This looks like a Bin Packing problem. These are really hard to solve, especially with SQL.
If you search on SO for Bin Packing + SQL, you'll find "How to find Sum(field) in condition, i.e. 'select * from table where sum(field) < 150'", which is basically the same problem except you want to add a NOT IN to it.
I couldn't get the accepted answer by brianegge to work but what he wrote about it in general was interesting
..the problem you
describe of wanting the selection of
users which would most closely fit
into a given size, is a bin packing
problem. This is an NP-Hard problem,
and won't be easily solved with ANSI
SQL. However, the above seems to
return the right result, but in fact
it simply starts with the smallest
item, and continues to add items until
the bin is full.
A general, more effective bin packing
algorithm would is to start with the
largest item and continue to add
smaller ones as they fit. This
algorithm would select users 5 and 4.
So with this advice you could write a cursor to loop over the table to do just this (it just wouldn't be pretty).
Aaron Alton gives a nice link to a series of articles that attempt to solve the Bin Packing problem with SQL but basically conclude that it's probably best to use a cursor to do it.
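The largest-first idea from the quote can be sketched outside SQL as well. The Python below is an illustration under stated assumptions: `split_by_deductible` is a hypothetical helper, the widget data is copied from the question, and the greedy pass is a heuristic that is not guaranteed to find the best fit for arbitrary prices.

```python
from decimal import Decimal

widgets = [
    ("a", Decimal("15.00")),
    ("b", Decimal("30.00")),
    ("c", Decimal("20.00")),
    ("d", Decimal("25.00")),
]

def split_by_deductible(widgets, deductible):
    """Greedy largest-first pass: return (used, kept) widget names."""
    remaining = deductible
    used, kept = [], []
    # Visit widgets from most to least expensive, as the quote suggests.
    for name, cost in sorted(widgets, key=lambda w: w[1], reverse=True):
        if cost <= remaining:
            used.append(name)       # this widget goes towards the deductible
            remaining -= cost
        else:
            kept.append(name)       # does not fit, so it stays in the result
    return used, sorted(kept)

used, kept = split_by_deductible(widgets, Decimal("50.00"))
```

For the sample data this uses b and c towards the $50.00 deductible and keeps a and d, which matches the output the question expects. A cursor-based SQL version would follow the same loop.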

Making a query more efficient for reads

I have a data model like the following:
username | product1 | product2
-------------------------------
harold abc qrs
harold abc def
harold def abc
kim abc def
kim lmn qrs
...
username | friend_username
---------------------------
john harold
john kim
...
I want to build a histogram of the most frequent product1 to product2 records there are, restricted to a given product1 id, and restricted only to friends of john. So something like:
What do friends of john link to for product1, when product1='abc':
Select all of john's friends from the friends table. For each friend, count and group the number of records where product1 = 'abc', sort results in desc order:
Results:
abc -> def (2 instances)
abc -> qrs (1 instance)
I know we can do the following in a relational database, but there will be some threshold where this kind of query will start utilizing a lot of resources. Users might have a large number of friend records (500+). If this query is running 5 times every time a user loads a page, I'm worried I'll run out of resources quickly.
Is there some other table I can introduce to my model to relieve the overhead of running the above query every time users want to see the histogram breakdown? All I can think of is to precompute the histograms when possible so that reads are optimized.
Thanks for any ideas
Here's your query:
SELECT p.product2,
COUNT(p.product2) AS num_product
FROM PRODUCTS p
JOIN FRIENDS f ON f.friend_username = p.username
AND f.username = 'john'
WHERE p.product1 = 'abc'
GROUP BY p.product2
ORDER BY num_product DESC
To handle 5 products, use:
SELECT p.product1,
p.product2,
COUNT(p.product2) AS num_product
FROM PRODUCTS p
JOIN FRIENDS f ON f.friend_username = p.username
AND f.username = 'john'
WHERE p.product1 IN ('abc', 'def', 'ghi', 'jkl', 'mno')
GROUP BY p.product1, p.product2
ORDER BY num_product DESC
It's pretty simple, and the more you can filter the records down, the faster it will run because of being a smaller dataset.
If this query is running 5 times every time a user loads a page, I'm worried I'll run out of resources quickly.
My first question is why you'd run this query more than once per page. If it's to cover more than one friend, the query I posted can be updated to expose counts for products on a per friend or user basis.
After that, I'd wonder if the query can be cached at all. How fresh do you really need the data to be - is 2 hours acceptable? How about 6 or 12... We'd all like the data to be instantaneous, but you need to weigh that against performance and make a decision.
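If some staleness is acceptable, a time-based cache in front of the query is one simple option. The sketch below is a minimal in-process version in Python; `TimedCache`, its `ttl_seconds` parameter, and the injectable `clock` are all hypothetical names chosen for illustration, and a real deployment might use a shared cache instead.

```python
import time

class TimedCache:
    """Cache computed values per key and recompute them after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry can be tested
        self._entries = {}           # key -> (time stored, value)

    def get(self, key, compute):
        """Return the cached value for key, calling compute() if it is stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()            # e.g. run the histogram query here
        self._entries[key] = (now, value)
        return value

# Example: keep each (user, product1) histogram for two hours.
histogram_cache = TimedCache(ttl_seconds=2 * 60 * 60)
```

A page load would then call `histogram_cache.get(("john", "abc"), run_query)`, where `run_query` is whatever function executes the SQL above, so the database is hit at most once per key per two hours.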

Datatable Compare Rows

I have a datatable object, which is populated from a webservice.
Apparently, the web service just throws everything (data) back to me. The data which gets in my datatable looks like this:
Dept Code Value
Science ABC 5
Science ABC 6
Science DEF 7
Math ABC 3
Math DEF 9
English ABC 2
English DEF 3
English DEF 4
English DEF 5
Now, I want to create a datatable that sums the values for each Dept/Code pair and eliminates the duplicate rows, so that the new datatable would have data like:
Dept Code Value
Science ABC 11
Science DEF 7
Math ABC 3
Math DEF 9
English ABC 2
English DEF 12
Please note that I can only modify the datatable.
Can anyone help me? VB.Net please. Thanks.
A simple summary query will give you what you want:
SELECT Dept, Code, SUM(Value) sum_value FROM datatable GROUP BY Dept, Code
You could also create a view with that SQL definition, so you could
just query the view as you would a table. If you start to get so much
data that the query is slow, you'll want to store the results in a
permanent table - but for moderate amounts of data this should work fine.
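Since the question says only the in-memory datatable can be modified, the same group-and-sum can also be done in code. The sketch below shows the logic in Python rather than VB.Net, with the sample rows hard-coded and a hypothetical helper named `sum_by_dept_and_code`; the loop translates directly to iterating over the rows of a DataTable.

```python
# Sample rows from the question: (Dept, Code, Value).
rows = [
    ("Science", "ABC", 5), ("Science", "ABC", 6), ("Science", "DEF", 7),
    ("Math", "ABC", 3), ("Math", "DEF", 9),
    ("English", "ABC", 2), ("English", "DEF", 3),
    ("English", "DEF", 4), ("English", "DEF", 5),
]

def sum_by_dept_and_code(rows):
    """Collapse rows that share (Dept, Code) into one row with the summed Value."""
    totals = {}
    for dept, code, value in rows:
        key = (dept, code)
        totals[key] = totals.get(key, 0) + value
    # Dicts keep insertion order, so groups appear in first-seen order.
    return [(dept, code, total) for (dept, code), total in totals.items()]

summary = sum_by_dept_and_code(rows)
```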