Complex multi-level hierarchical SQL

How can I achieve the results below using a query in SQL Server?
Table: shares_info
Complex multilevel hierarchy:
comp_name investee
APPLE MS
APPLE INTEL
APPLE MRF
APPLE GOOG
MS GOOG
MS MRF
MRF STF
MRF ABC
GOOG INTEL
GOOG TRF
GOOG XYZ
The idea is this: APPLE has invested in MS, INTEL, MRF and GOOG, and so on down the hierarchy. The input below is a list of shares to sell, and shares without dependencies must be sold first. That is what my expected output conveys. For example, GOOG depends on INTEL, TRF and XYZ, so before selling GOOG I need to sell (123, XYZ) and (456, INTEL), the GOOG investees that appear in the input. Likewise, APPLE depends on MS, INTEL, MRF and GOOG, so I first need to sell INTEL, MRF and GOOG (the ones present in the input) before selling APPLE.
Table: shares_sell_info
Some input
id comp_name
123 APPLE
456 APPLE
123 XYZ
789 GOOG
456 INTEL
243 MRF
432 ABC
The ordering should be like below
123 XYZ (XYZ does not have any dependency and hence should come at the top)
432 ABC (MRF has a dependency on ABC and hence ABC comes on top)
243 MRF (MRF's dependencies are all taken care of, hence MRF comes next)
456 INTEL (APPLE and GOOG depend on INTEL, hence INTEL comes before them)
789 GOOG (At this point we can add GOOG because all its dependencies are already above it)
123 APPLE (APPLE has a dependency on GOOG, hence GOOG comes before APPLE)
456 APPLE
In the above ordering, either XYZ or ABC could come first; it does not matter because neither has any dependency.

dbfiddle
WITH
cte_com as (SELECT * FROM (VALUES
(123 ,'APPLE'),
(456 ,'APPLE'),
(123 ,'XYZ'),
(789 ,'GOOG'),
(456 ,'INTEL'),
(243 ,'MRF'),
(432 ,'ABC')) as cte_com(id, comp))
,cte_temp as (SELECT * FROM (VALUES
('APPLE', 'MS'),
('APPLE', 'INTEL' ),
('APPLE', 'MRF' ),
('APPLE', 'GOOG' ),
('MS', 'GOOG' ),
('MS', 'MRF' ),
('MRF', 'STF' ),
('MRF', 'ABC' ),
('GOOG', 'INTEL' ),
('GOOG', 'TRF' ),
('GOOG', 'XYZ')) as cte_temp(one, two))
SELECT id, comp , one
, count(*) as count
from cte_com
left join cte_temp on cte_temp.one=cte_com.comp
group by id, comp, one
order by count(*)
But it's unclear why this solution gives the ordering you want.
What is the difference between 'XYZ' and 'ABC'?
They are both depending on 1 other comp.
output:
id  | comp  | one   | count
----+-------+-------+------
123 | XYZ   | NULL  | 1
432 | ABC   | NULL  | 1
456 | INTEL | NULL  | 1
243 | MRF   | MRF   | 2
789 | GOOG  | GOOG  | 3
123 | APPLE | APPLE | 4
456 | APPLE | APPLE | 4
7 rows

I think @Luuk's idea is right, with some slight modifications. Here is the query that worked for me.
select * from shares_sell_info as ssi
left join (
select comp_name, count(*) as count
from shares_info si
group by comp_name
UNION
select comp_name, 0 as count
from shares_info
where investee is null
) temp on temp.comp_name = ssi.comp_name
where id in (
)
order by count

Here is the actual answer for my problem that I got from another post.
https://stackoverflow.com/questions/60420380/assign-weight-based-on-hierarchical-depth
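The depth-weight idea behind that link can be sketched roughly as follows. This is only an illustration under stated assumptions, not the linked answer itself: it runs on SQLite through Python's sqlite3 module rather than on SQL Server (where the CTE would be written without the RECURSIVE keyword), the function name sell_order is made up for the example, and it assumes the hierarchy has no cycles. Each company gets a weight equal to the longest chain of investees below it, and the sell list is ordered by that weight, so anything a company depends on always sorts earlier.

```python
import sqlite3

# Edges and sell list copied from the question.
SHARES_INFO = [
    ("APPLE", "MS"), ("APPLE", "INTEL"), ("APPLE", "MRF"), ("APPLE", "GOOG"),
    ("MS", "GOOG"), ("MS", "MRF"),
    ("MRF", "STF"), ("MRF", "ABC"),
    ("GOOG", "INTEL"), ("GOOG", "TRF"), ("GOOG", "XYZ"),
]
SHARES_SELL_INFO = [
    (123, "APPLE"), (456, "APPLE"), (123, "XYZ"), (789, "GOOG"),
    (456, "INTEL"), (243, "MRF"), (432, "ABC"),
]

ORDER_SQL = """
WITH RECURSIVE
companies(comp_name) AS (          -- every company, investor or investee
    SELECT comp_name FROM shares_info
    UNION
    SELECT investee FROM shares_info
),
walk(root, node, depth) AS (       -- walk down from each company
    SELECT comp_name, comp_name, 0 FROM companies
    UNION ALL
    SELECT w.root, si.investee, w.depth + 1
    FROM walk AS w
    JOIN shares_info AS si ON si.comp_name = w.node
),
height(comp_name, height) AS (     -- longest chain below each company
    SELECT root, MAX(depth) FROM walk GROUP BY root
)
SELECT s.id, s.comp_name, COALESCE(h.height, 0) AS height
FROM shares_sell_info AS s
LEFT JOIN height AS h ON h.comp_name = s.comp_name
ORDER BY height, s.comp_name, s.id
"""


def sell_order():
    """Return (id, comp_name, height) rows, leaves first. Assumes no cycles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shares_info (comp_name TEXT, investee TEXT)")
    conn.execute("CREATE TABLE shares_sell_info (id INTEGER, comp_name TEXT)")
    conn.executemany("INSERT INTO shares_info VALUES (?, ?)", SHARES_INFO)
    conn.executemany("INSERT INTO shares_sell_info VALUES (?, ?)", SHARES_SELL_INFO)
    rows = conn.execute(ORDER_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in sell_order():
        print(row)
```

The order within one weight level is arbitrary (here alphabetical), which matches the remark in the question that XYZ and ABC may appear in either order.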

Related

Postgres rank() without duplicates

I'm ranking race data for series of cycling events. Racers win various amounts of points for their position in races. I want to retain the discrete event scoring, but also rank the racer in the series. For example, considering a sub-query that returns this:
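License # | Rider Name | Total Points | Race Points | Race ID
----------+------------+--------------+-------------+--------
123       | Joe        | 25           | 5           | 567
123       | Joe        | 25           | 12          | 234
123       | Joe        | 25           | 8           | 987
456       | Ahmed      | 20           | 12          | 567
456       | Ahmed      | 20           | 8           | 234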
You can see Joe has 25 points, as he won 5, 12, and 8 points in three races. Ahmed has 20 points, as he won 12 and 8 points in two races.
Now for the ranking, what I'd like is:
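Place | License # | Rider Name | Total Points | Race Points | Race ID
------+-----------+------------+--------------+-------------+--------
1     | 123       | Joe        | 25           | 5           | 567
1     | 123       | Joe        | 25           | 12          | 234
1     | 123       | Joe        | 25           | 8           | 987
2     | 456       | Ahmed      | 20           | 12          | 567
2     | 456       | Ahmed      | 20           | 8           | 234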
But if I use rank() and order by "Total Points", I get:
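Place | License # | Rider Name | Total Points | Race Points | Race ID
------+-----------+------------+--------------+-------------+--------
1     | 123       | Joe        | 25           | 5           | 567
1     | 123       | Joe        | 25           | 12          | 234
1     | 123       | Joe        | 25           | 8           | 987
4     | 456       | Ahmed      | 20           | 12          | 567
4     | 456       | Ahmed      | 20           | 8           | 234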
Which makes sense, since there are three "ties" at 25 points.
dense_rank() solves this problem, but if there are legitimate ties across different racers, I want there to be gaps in the rank (e.g. if Joe and Ahmed both had 25 points, the next racer would be in third place, not second).
The easiest way to solve this I think would be to issue two queries, one with the "duplicate" racers eliminated, and then a second one where I can retain the individual race data, which I need for the points break down display.
I can also probably, given enough effort, think of a way to do this in a single query, but I'm wondering if I'm not just missing something really obvious that could accomplish this in a single, relatively simple query.
Any suggestions?
You have to break this into steps to get what you want, but that can be done in a single query with common table expressions:
with riders as ( -- get individual riders
    select distinct license, rider, total_points
    from racers
), places as ( -- calculate non-dense rankings
    select license, rider, rank() over (order by total_points desc) as place
    from riders
)
select p.place, r.* -- join rankings into main table
from places p
join racers r on (r.license, r.rider) = (p.license, p.rider);
db<>fiddle here
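To see the gap behaviour the question asks about, here is a small sketch of the same two-step approach with a tie added. Several things in it are assumptions made for the example: it runs on SQLite (3.25 or newer, for window functions) through Python's sqlite3 module instead of Postgres, the table is called results, Ahmed's points are changed so that he ties with Joe, and the rider Carla is invented.

```python
import sqlite3

# (license, rider, total_points, race_points, race_id)
# Hypothetical data: Ahmed is altered to tie with Joe, Carla is invented.
RESULTS = [
    (123, "Joe", 25, 5, 567), (123, "Joe", 25, 12, 234), (123, "Joe", 25, 8, 987),
    (456, "Ahmed", 25, 12, 567), (456, "Ahmed", 25, 8, 234), (456, "Ahmed", 25, 5, 987),
    (789, "Carla", 20, 8, 567), (789, "Carla", 20, 5, 234), (789, "Carla", 20, 7, 987),
]

RANK_SQL = """
WITH riders AS (                   -- one row per rider
    SELECT DISTINCT license, rider, total_points
    FROM results
),
places AS (                        -- rank() leaves gaps after ties
    SELECT license, RANK() OVER (ORDER BY total_points DESC) AS place
    FROM riders
)
SELECT p.place, r.license, r.rider, r.total_points, r.race_points, r.race_id
FROM places AS p
JOIN results AS r ON r.license = p.license
ORDER BY p.place, r.license, r.race_id
"""


def ranked_results():
    """Return every race row with the rider's series place attached."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (license INTEGER, rider TEXT, "
        "total_points INTEGER, race_points INTEGER, race_id INTEGER)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", RESULTS)
    rows = conn.execute(RANK_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in ranked_results():
        print(row)
```

Because the ranking is computed over one row per rider, Joe and Ahmed share place 1 and Carla lands in place 3, while all nine race rows are kept for the points breakdown.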

How to turn values of a column into new individual columns in SQL

Hello everyone, I am trying to convert a categorical column named Educational Group into separate columns. It has values like:
State | Educational Group | No of Persons |
-------+-----------------------+---------------+
A Below Metric 123
A metric/secondary 456
A diploma 789
A graduate and above 101112
A post graduate 131415
B Below Metric 145
B metric/secondary 467
B diploma 564
B graduate and above 987
B post graduate 875
I want this to be converted as
State | Below Metric_ NO of persons | Metric/Secondary_No of persons | Diploma_No of Persons| ...
-------+-------------------------------+--------------------------------+---------------------+
A 123 456 789
B 145 467 564
and so on for all states and all educational levels.
Is it possible to do this in SQL? I did the same in Python using the pivot function and it worked pretty well, and now I want the same to be done in Microsoft SQL Server Management Studio.
I want to convert this
https://ibb.co/L15m2sS
into this https://ibb.co/9tLpk7V
As mentioned, PIVOT should do the trick.
SELECT *
FROM
(
    SELECT *
    FROM mytable
) AS SourceTable
PIVOT (
    AVG([No_of_Persons])
    FOR [Educational_Group] IN ([Below Metric],
                                [metric/secondary],
                                [diploma],
                                [graduate and above],
                                [post graduate])
) AS PivotTable;
Online demonstration using your table on db<>fiddle.
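For comparison, the same reshaping can be written with conditional aggregation, which does not need the PIVOT keyword. The sketch below is a swapped-in technique, not the query from the answer: it runs on SQLite through Python's sqlite3 module, and the lower-case column names and the output aliases are assumptions made for the example.

```python
import sqlite3

# Data copied from the question: (state, educational_group, no_of_persons)
ROWS = [
    ("A", "Below Metric", 123), ("A", "metric/secondary", 456),
    ("A", "diploma", 789), ("A", "graduate and above", 101112),
    ("A", "post graduate", 131415),
    ("B", "Below Metric", 145), ("B", "metric/secondary", 467),
    ("B", "diploma", 564), ("B", "graduate and above", 987),
    ("B", "post graduate", 875),
]

# One SUM(CASE ...) per target column; rows that do not match contribute NULL.
PIVOT_SQL = """
SELECT state,
       SUM(CASE WHEN educational_group = 'Below Metric'       THEN no_of_persons END) AS below_metric,
       SUM(CASE WHEN educational_group = 'metric/secondary'   THEN no_of_persons END) AS metric_secondary,
       SUM(CASE WHEN educational_group = 'diploma'            THEN no_of_persons END) AS diploma,
       SUM(CASE WHEN educational_group = 'graduate and above' THEN no_of_persons END) AS graduate_and_above,
       SUM(CASE WHEN educational_group = 'post graduate'      THEN no_of_persons END) AS post_graduate
FROM mytable
GROUP BY state
ORDER BY state
"""


def pivot_by_state():
    """Return one row per state with one column per educational group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (state TEXT, educational_group TEXT, no_of_persons INTEGER)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(PIVOT_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in pivot_by_state():
        print(row)
```

Like PIVOT, this needs the list of categories to be known when the query is written.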

Generate rows where none exist

I'm a little stumped on how to generate rows when none exist for specified conditions. Apologies for the formatting since I don't know how to write tables in SO posts, but let's say I have data that looks like this:
TimePeriodID CityspanSiteKey Mean_Name Mean
2 123 Social Environment 4
2 123 Youth with Adults 3.666666746
2 123 Youth with Peers 3.5
4 123 Social Environment 2.75
4 123 Youth with Adults 2.555555582
4 123 Youth with Peers 3.5
There are a few other Mean_Name values which I would like to include in every single time period ID, but with a Mean value of NULL, like the following:
TimePeriodID CityspanSiteKey Mean_Name Mean
2 123 Social Environment 4
2 123 Youth with Adults 3.666666746
2 123 Youth with Peers 3.5
2 123 Staff Build Relationships and Support Individual Youth NULL
2 123 Staff Positively Guide Behavior NULL
4 123 Social Environment 2.75
4 123 Youth with Adults 2.555555582
4 123 Youth with Peers 3.5
4 123 Staff Build Relationships and Support Individual Youth NULL
4 123 Staff Positively Guide Behavior NULL
5 123 Social Environment 2.75
5 123 Youth with Adults 2.555555582
5 123 Youth with Peers 3.5
5 123 Staff Build Relationships and Support Individual Youth NULL
5 123 Staff Positively Guide Behavior NULL
6 123 Social Environment NULL
6 123 Youth with Adults NULL
6 123 Youth with Peers NULL
6 123 Staff Build Relationships and Support Individual Youth NULL
6 123 Staff Positively Guide Behavior NULL
What's the best way to go about doing this? I don't think CASEing will be of much use since these records don't exist.
You seem to want a cross join and then a left join. Not all values are in your original data, so you might as well construct them:
select ti.timeperiod, c.CityspanSiteKey, m.mean_name, t.mean
from (values (2), (4), (5), (6)
     ) ti(timeperiod) cross join
     (values (123)
     ) c(CityspanSiteKey) cross join
     (values ('Social Environment'), ('Youth with Adults'), ('Youth with Peers'), ('Staff Build Relationships and Support Individual Youth'), ('Staff Positively Guide Behavior')
     ) m(mean_name) left join
     t
     on t.TimePeriodID = ti.timeperiod and
        t.CityspanSiteKey = c.CityspanSiteKey and
        t.mean_name = m.mean_name;
You can use subqueries or existing tables instead of the values() clause.
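A runnable sketch of the cross join plus left join pattern, under a few assumptions: it uses SQLite through Python's sqlite3 module, so the constructed value lists are written as CTEs with column lists instead of values (...) alias(col), and only the six rows shown in the question's first table are loaded into t.

```python
import sqlite3

# The six existing rows from the question.
EXISTING = [
    (2, 123, "Social Environment", 4.0),
    (2, 123, "Youth with Adults", 3.666666746),
    (2, 123, "Youth with Peers", 3.5),
    (4, 123, "Social Environment", 2.75),
    (4, 123, "Youth with Adults", 2.555555582),
    (4, 123, "Youth with Peers", 3.5),
]

FILL_SQL = """
WITH periods(TimePeriodID) AS (VALUES (2), (4), (5), (6)),
     sites(CityspanSiteKey) AS (VALUES (123)),
     names(Mean_Name) AS (VALUES
         ('Social Environment'),
         ('Youth with Adults'),
         ('Youth with Peers'),
         ('Staff Build Relationships and Support Individual Youth'),
         ('Staff Positively Guide Behavior'))
SELECT p.TimePeriodID, s.CityspanSiteKey, n.Mean_Name, t.Mean
FROM periods AS p
CROSS JOIN sites AS s
CROSS JOIN names AS n              -- every combination that should exist
LEFT JOIN t                        -- attach a Mean where one is recorded
       ON t.TimePeriodID = p.TimePeriodID
      AND t.CityspanSiteKey = s.CityspanSiteKey
      AND t.Mean_Name = n.Mean_Name
ORDER BY p.TimePeriodID, n.Mean_Name
"""


def filled_rows():
    """Return every period/site/name combination, with Mean NULL where missing."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (TimePeriodID INTEGER, CityspanSiteKey INTEGER, "
        "Mean_Name TEXT, Mean REAL)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", EXISTING)
    rows = conn.execute(FILL_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in filled_rows():
        print(row)
```

With this input, periods 5 and 6 come back entirely NULL because t holds no rows for them.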

Order By Sums, while not grouping like rows

Suppose I have data that looks like this as the result of a query
SKU | STOCK | SNACK | FLAVOR
1234 45 Chips BBQ
1236 87 Chips BBQ
2345 12 Pretzel Bacon
3456 51 Chips Ranch
4567 32 Pretzel Classic
5678 142 Candy Chocolate
... ... ... ...
Is it possible to have SQL in an ORDER BY line that allows me to display the above data first sorted by whatever Snack (Chips, Pretzel, Candy, etc.) has the largest SUM(Stock) and then by Stock DESC while not merging any of the entries? I briefly tried to use a line similar to
ORDER BY
SUM(Snack) DESC,
SUM(Flavor) DESC,
Stock DESC
but could not determine how the GROUP BY statement should be laid out.
You can use DSum to compute total STOCK for each SNACK without a GROUP BY. And use that Dsum in the ORDER BY. I also needed to use Val() on the DSum values to make it sort correctly.
SELECT y.SKU, y.STOCK, y.SNACK, y.FLAVOR
FROM YourTable AS y
ORDER BY
Val(DSum("[STOCK]", "YourTable", "[SNACK]='" & y.SNACK & "'")) DESC,
y.STOCK DESC;
Be aware that DSum is Access-specific so this is not suitable if you want a query which can be ported to another database.
Try the following SQL:
select sku, stock, data.snack, flavor, summ = summ.summ
from data
join (select snack, summ = sum(stock) from data group by snack) as summ
on summ.snack = data.snack
order by summ desc, stock desc
SKU|STOCK|SNACK|FLAVOR|SUMM
1236 87 Chips BBQ 183
3456 51 Chips Ranch 183
1234 45 Chips BBQ 183
5678 142 Candy Chocolate 142
4567 32 Pretzel Classic 44
2345 12 Pretzel Bacon 44
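Where window functions are available, the per-snack total can also be computed without a join. The sketch below shows that variant, assuming SQLite 3.25 or newer through Python's sqlite3 module; the table name data follows the answer above, and the extra snack tiebreaker in the ORDER BY is an addition for the example so that two snacks with equal totals do not interleave.

```python
import sqlite3

# The six rows shown in the question: (sku, stock, snack, flavor)
DATA = [
    (1234, 45, "Chips", "BBQ"),
    (1236, 87, "Chips", "BBQ"),
    (2345, 12, "Pretzel", "Bacon"),
    (3456, 51, "Chips", "Ranch"),
    (4567, 32, "Pretzel", "Classic"),
    (5678, 142, "Candy", "Chocolate"),
]

# SUM(...) OVER (PARTITION BY snack) attaches the snack total to every row
# without collapsing the rows, so no GROUP BY is needed.
ORDER_SQL = """
SELECT sku, stock, snack, flavor,
       SUM(stock) OVER (PARTITION BY snack) AS snack_total
FROM data
ORDER BY snack_total DESC, snack, stock DESC
"""


def ordered_rows():
    """Return all rows ordered by their snack's total stock, then by stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (sku INTEGER, stock INTEGER, snack TEXT, flavor TEXT)")
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", DATA)
    rows = conn.execute(ORDER_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in ordered_rows():
        print(row)
```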

MS Access, Excel, SQL, and New Tables

I'm just starting out with MS Access 2010 and have the following setup. 3 excel files: masterlist.x (which contains every product that I sell), vender1.x (which contains all products from vender1, I only sell some of these products), and vender2.x (again, contains all products from vender2, I only sell some of these products). Here's an example data collection:
masterlist.x
ID NAME PRICE
23 bananas .50
33 apples .75
35 nuts .87
38 raisins .25
vender1.x
ID NAME PRICE
23 bananas .50
25 pears .88
vender2.x
ID NAME PRICE
33 apples .75
35 nuts .87
38 raisins .25
49 kiwis .88
The vender lists get periodically updated with new items for sale and new prices. For example, if vender1 raises the price of bananas to $.75, my masterlist.x would need to be updated to reflect this.
Where I'm at now: I know how to import the 3 Excel sheets into Access. From there, I've been researching whether I need to set up relationships, create a macro, or write a SQL query to accomplish my goals. I'm not necessarily looking for a solution, but being pointed in the right direction would be great!
Also, once the masterlist.x table is updated, what feature would I use to see which line items were affected?
Update: discovered SQL /JOIN/ and have the following:
SELECT * FROM master
LEFT JOIN vender1
ON master.ID = vender1.ID
where master.PRICE <> vender1.PRICE;
This gives me the output (for the above scenario)
ID NAME PRICE ID NAME PRICE
23 bananas .50 23 bananas .75
What feature would instead give me:
masterlist.x
ID NAME PRICE
23 bananas .75
33 apples .75
35 nuts .87
38 raisins .25
Here are some design ideas, since you asked for them. I am not keen on your current table schema. The following queries were written for SQL Server 2008, the closest syntax to MS Access SQL that I could get in SQL Fiddle.
Please take a look:
SQLFIDDLE DEMO
Proposed table design:
vendor table:
VID VNAME
1 smp farms
2 coles
3 cold str
4 Anvil NSW
product table:
PID VID PNAME PPRICE
203 2 bananas 0.5
205 2 pears 0.88
301 3 bananas 0.78
303 3 apples 0.75
305 3 nuts 0.87
308 3 raisins 0.25
409 4 kiwis 0.88
masterlist:
ID PID MPRICE
1 203 0.5
2 303 0.75
3 305 0.87
4 308 0.25
Join queries can now easily update your masterlist, for example when a vendor updates the prices of the products they supply, or when they stop supplying a product. You may add where clauses to the query for whatever conditions you need.
Query:
SELECT m.id, p.vid, p.pname, p.pprice
FROM masterlist m
LEFT JOIN product p ON p.pid = m.pid
;
Results:
ID VID PNAME PPRICE
1 2 bananas 0.5
2 3 apples 0.75
3 3 nuts 0.87
4 3 raisins 0.25
Please comment. Happy to help if you have any doubts.
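As a rough illustration of how the proposed tables could answer both parts of the question (which line items are affected, and how to bring the master list up to date), here is a sketch using SQLite through Python's sqlite3 module. The function names, the simulated price rise on product 203 and the UPDATE statement itself are assumptions for the example; Access has its own update-query syntax, so treat the SQL as a pattern rather than something to paste in.

```python
import sqlite3

# Tables as proposed above.
PRODUCT = [  # (PID, VID, PNAME, PPRICE)
    (203, 2, "bananas", 0.5), (205, 2, "pears", 0.88),
    (301, 3, "bananas", 0.78), (303, 3, "apples", 0.75),
    (305, 3, "nuts", 0.87), (308, 3, "raisins", 0.25),
    (409, 4, "kiwis", 0.88),
]
MASTERLIST = [(1, 203, 0.5), (2, 303, 0.75), (3, 305, 0.87), (4, 308, 0.25)]  # (ID, PID, MPRICE)


def build_demo():
    """Create the tables and simulate a vendor raising the banana price."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (PID INTEGER, VID INTEGER, PNAME TEXT, PPRICE REAL)")
    conn.execute("CREATE TABLE masterlist (ID INTEGER, PID INTEGER, MPRICE REAL)")
    conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", PRODUCT)
    conn.executemany("INSERT INTO masterlist VALUES (?, ?, ?)", MASTERLIST)
    conn.execute("UPDATE product SET PPRICE = 0.75 WHERE PID = 203")  # hypothetical change
    return conn


def refresh_masterlist(conn):
    """Report the master rows whose price is stale, then update them."""
    changed = conn.execute(
        """
        SELECT m.ID, p.PNAME, m.MPRICE, p.PPRICE
        FROM masterlist AS m
        JOIN product AS p ON p.PID = m.PID
        WHERE p.PPRICE <> m.MPRICE
        ORDER BY m.ID
        """
    ).fetchall()
    conn.execute(
        """
        UPDATE masterlist
        SET MPRICE = (SELECT p.PPRICE FROM product AS p WHERE p.PID = masterlist.PID)
        WHERE EXISTS (SELECT 1 FROM product AS p
                      WHERE p.PID = masterlist.PID AND p.PPRICE <> masterlist.MPRICE)
        """
    )
    return changed


if __name__ == "__main__":
    demo = build_demo()
    print(refresh_masterlist(demo))
    print(demo.execute("SELECT ID, MPRICE FROM masterlist ORDER BY ID").fetchall())
```

Running the select before the update gives the list of affected line items, which is the "what changed" report asked about in the question.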