Combining multiple related tables with different dates - sql

Suppose I have a company that buys a certain number of computers each month, and the ratio of desktops to laptops, as well as the brand, sometimes changes. (This is a made-up scenario; my actual scenario has nothing to do with computers.) Example:
For the accounting department:
On 1/1:
Laptops 50% (Dell) Desktop 50% (Acer)
For the next few months, I use that buying structure. However, I decide I want to change brands:
On 4/1:
Laptops 50% (Apple) Desktop 50% (Asus)
Then later, I decide I need more laptops:
On 7/1:
Laptops 75% (Apple) Desktop 25% (Asus)
For a certain department, I would like to get the following data in a report:
Date  Company  Weight
1/1   Dell     .5
1/1   Acer     .5
4/1   Apple    .5
4/1   Asus     .5
7/1   Apple    .75
7/1   Asus     .25
Here is what the database table structure looks like:
Table: department
id  name
1   Accounting
2   Developers
Table: computer_type
id  name
1   Laptop
2   Desktop
Table: purchase_weights
date  computer_type_id  weight  department_id
1/1   1                 .5      1
1/1   2                 .5      1
7/1   1                 .75     1
7/1   2                 .25     1
Table: purchase_companies
date  computer_type_id  company
1/1   1                 Dell
1/1   2                 Acer
4/1   1                 Apple
4/1   2                 Asus
As you can see, the tables need to be joined so that a date missing from one table is forward filled from that table's most recent earlier entry. There is no entry in the weights table for 4/1, when the company changed, and there is no entry in the companies table for 7/1, when the weights changed. There is also a constraint: purchase_companies has no department column, so if the company for laptops changes, every department whose weights include laptops starts buying from the new company. Any insight would be very helpful, thanks!
Things I have tried:
Full outer join (does not include missing dates)
SELECT *
FROM purchase_weights pw
FULL OUTER JOIN purchase_companies pc
    ON pw.computer_type_id = pc.computer_type_id
    AND pw.date = pc.date
WHERE pw.department_id = 1
Normal join (does a cross multiplication)
SELECT *
FROM purchase_weights pw
JOIN purchase_companies pc
    ON pw.computer_type_id = pc.computer_type_id
WHERE pw.department_id = 1
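One possible way to get the forward fill is to collect every date on which either table changes, then pick for each of those dates the latest weights and the latest company on or before it. The sketch below is not a definitive answer, just an illustration run against SQLite from Python. It assumes ISO date strings (the year 2023 is made up), takes the 7/1 weights from the narrative (laptops 75%), and assumes each weights date lists every computer type the department buys.

```python
import sqlite3

# Sample data from the question. The year 2023 and the ISO date format are
# assumptions, chosen so that dates compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase_weights (
    date TEXT, computer_type_id INTEGER, weight REAL, department_id INTEGER);
CREATE TABLE purchase_companies (
    date TEXT, computer_type_id INTEGER, company TEXT);
INSERT INTO purchase_weights VALUES
    ('2023-01-01', 1, 0.5, 1), ('2023-01-01', 2, 0.5, 1),
    ('2023-07-01', 1, 0.75, 1), ('2023-07-01', 2, 0.25, 1);
INSERT INTO purchase_companies VALUES
    ('2023-01-01', 1, 'Dell'), ('2023-01-01', 2, 'Acer'),
    ('2023-04-01', 1, 'Apple'), ('2023-04-01', 2, 'Asus');
""")

FORWARD_FILL_SQL = """
WITH change_dates AS (
    -- every date on which the weights or the companies changed
    SELECT date FROM purchase_weights WHERE department_id = :dept
    UNION
    SELECT date FROM purchase_companies
)
SELECT d.date, pc.company, pw.weight
FROM change_dates AS d
JOIN purchase_weights AS pw
  ON pw.department_id = :dept
 -- latest weights on or before the change date
 AND pw.date = (SELECT MAX(w.date) FROM purchase_weights AS w
                WHERE w.department_id = :dept AND w.date <= d.date)
JOIN purchase_companies AS pc
  ON pc.computer_type_id = pw.computer_type_id
 -- latest company for that computer type on or before the change date
 AND pc.date = (SELECT MAX(c.date) FROM purchase_companies AS c
                WHERE c.computer_type_id = pw.computer_type_id
                  AND c.date <= d.date)
ORDER BY d.date, pw.computer_type_id
"""

report = conn.execute(FORWARD_FILL_SQL, {"dept": 1}).fetchall()
for row in report:
    print(row)
```

Because the companies table is not filtered by department, a company change for a computer type the department does not buy would still add a date to change_dates and repeat the previous rows under that date; filtering change_dates further is left out of this sketch.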

Related

Create fair portfolios from an array of elements in Jupyter/pandas

Good afternoon!
I have a list of sellers in the following form:
Seller 1 - turnover $90
Seller 2 - turnover $110
Seller 3 - turnover $80
Seller 4 - turnover $120
...
I need an algorithm that distributes them among supervisors so that each supervisor gets 2 sellers and the combined turnover per supervisor is equal.
That is, the first supervisor gets sellers 1 and 2, and the second gets sellers 3 and 4. What algorithm can be used here? Are there any pandas functions that can distribute the sellers in a DataFrame this way?
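One simple heuristic, sketched here under the assumption that "fair" means the pair totals should be as close as possible, is to sort the sellers by turnover and pair the lowest remaining with the highest remaining. The function name `pair_sellers` is made up for this example; it uses plain Python, and the turnover values could just as well come from a DataFrame column.

```python
def pair_sellers(turnover):
    """Pair the lowest remaining turnover with the highest remaining one.

    `turnover` maps seller name -> turnover. Returns a list of
    (seller_a, seller_b, combined_turnover) tuples, one per supervisor.
    With an odd number of sellers the middle one is left unassigned.
    """
    ranked = sorted(turnover.items(), key=lambda item: item[1])
    pairs = []
    lo, hi = 0, len(ranked) - 1
    while lo < hi:
        (name_lo, value_lo), (name_hi, value_hi) = ranked[lo], ranked[hi]
        pairs.append((name_lo, name_hi, value_lo + value_hi))
        lo += 1
        hi -= 1
    return pairs


# The four sellers from the question.
sellers = {"Seller 1": 90, "Seller 2": 110, "Seller 3": 80, "Seller 4": 120}
pairs = pair_sellers(sellers)
for first, second, total in pairs:
    print(first, second, total)
```

For the four sellers above both pairs total $200, which matches the split described in the question. With real data the totals will usually not be exactly equal, only close.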

Query to aggregate across multiple tables

I'm new to SQL, and I'm trying to create a database to manage a small inventory. This is the structure of the db:
DatabaseStructure
I need a query that returns the total inventory per material. The first step is to look up all the batches associated with the material. The second is to look up all the movements associated with each batch. Then sum the quantity of each movement according to its movement type: a good receipt is added (+) and an inventory withdrawal is subtracted (-).
Here is an example of the tables with sample data and the desired result.
Table Material
MaterialID  MaterialDescription
1           Bottle
2           Box
Table Batch
BatchID  MaterialID  VendorMaterial  VendorBatch  ExpirationDate
1000     1           2096027         00123456     12/12/2025
1001     1           2096027         00987654     11/11/2026
1002     2           102400          202400E      10/10/2023
Table Movement
MovementID  BatchID  MovementType          Quantity  CreatedBy        CreatedOnDate
1           1000     Good receipt          100       user1#email.com  4/10/2022
2           1000     Inventory withdrawal  20        user2#email.com  4/15/2022
3           1000     Inventory withdrawal  25        user3#email.com  4/17/2022
4           1001     Good receipt          100       user1#email.com  4/20/2022
5           1001     Inventory withdrawal  10        user4#email.com  4/26/2022
6           1002     Good receipt          50        user1#email.com  2/26/2022
Expected query result - total inventory per material:
MaterialDescription  TotalInventory
Bottle               145
Box                  50
TotalInventory calculation: for Bottle there are two good receipt movements of 100 each and three withdrawals of 20, 25 and 10. So the total inventory is (100+100)-(20+25+10)=145.
Thanks for your help!
select
    mat.MaterialDescription,
    sum(
        case mov.MovementType
            when 'Good receipt' then 1
            when 'Inventory withdrawal' then -1
            else 0 /* don't know what to do for other MovementTypes */
        end * mov.Quantity
    ) as TotalInventory
from
    Material as mat
    left join Batch as bat on bat.MaterialID = mat.MaterialID
    left join Movement as mov on mov.BatchID = bat.BatchID
group by
    mat.MaterialDescription
;
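To check the query against the sample data, here is a small sketch that loads the three tables into an in-memory SQLite database from Python and runs it. Columns the query does not touch (vendor fields, dates, users) are left out, and an ORDER BY is added only to make the output order predictable; both are choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Material (MaterialID INTEGER, MaterialDescription TEXT);
CREATE TABLE Batch (BatchID INTEGER, MaterialID INTEGER);
CREATE TABLE Movement (
    MovementID INTEGER, BatchID INTEGER, MovementType TEXT, Quantity INTEGER);
INSERT INTO Material VALUES (1, 'Bottle'), (2, 'Box');
INSERT INTO Batch VALUES (1000, 1), (1001, 1), (1002, 2);
INSERT INTO Movement VALUES
    (1, 1000, 'Good receipt', 100),
    (2, 1000, 'Inventory withdrawal', 20),
    (3, 1000, 'Inventory withdrawal', 25),
    (4, 1001, 'Good receipt', 100),
    (5, 1001, 'Inventory withdrawal', 10),
    (6, 1002, 'Good receipt', 50);
""")

INVENTORY_SQL = """
select
    mat.MaterialDescription,
    sum(
        case mov.MovementType
            when 'Good receipt' then 1           -- receipts add stock
            when 'Inventory withdrawal' then -1  -- withdrawals remove stock
            else 0
        end * mov.Quantity
    ) as TotalInventory
from
    Material as mat
    left join Batch as bat on bat.MaterialID = mat.MaterialID
    left join Movement as mov on mov.BatchID = bat.BatchID
group by
    mat.MaterialDescription
order by
    mat.MaterialDescription
"""

inventory = conn.execute(INVENTORY_SQL).fetchall()
for description, total in inventory:
    print(description, total)
```

Note that with the left joins a material that has no movements at all comes back with a NULL total rather than 0; wrapping the sum in coalesce(..., 0) would change that if needed.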

Calculating Proportions for Baseball-Related Query

Here are my two tables:
BIO - contains player biographical information with the following columns
i. PLAYER_ID
ii. PLAYER_NAME
iii. DATE_OF_BIRTH
iv. TEAM_NAME
PITCHES - contains batter and pitcher statistics by pitch with the following columns
i. GAME_DATE (formatted YYYY-MM-DD, e.g. 2016-01-01)
ii. BATTER_PLAYER_ID
iii. PITCHER_PLAYER_ID
iv. PITCHER_THROW_SIDE (L/R)
v. BATTER_HAND (L/R)
vi. PITCH_TYPE (Changeup, Curveball, Cutter, 4-seam fastball, Knuckleball, 2-Seam Fastball, Slider, Splitter)
vii. PITCH_CALL (Ball, CatcherInterference, FoulBall, HitByPitch, InPlay, StrikeCalled, StrikeSwinging)
viii. IN_ZONE (YES/NO)
I want a query that returns the names of players with an in-zone or out-of-zone swinging strike rate of greater than 15% for fastballs for the 2016-2017 seasons, combined. I also want team name, pitcher handedness, and to include cutters and sinkers as fastballs.
Here is what I have so far:
SELECT b.PLAYER_NAME, b.TEAM_NAME, p.PITCHER_THROW_SIDE
FROM BIO AS b INNER JOIN PITCHES AS p
ON b.PLAYER_ID = p.PITCHER_PLAYER_ID
WHERE p.PITCH_TYPE = '4-seam fastball' OR p.PITCH_TYPE = '2-Seam' OR p.PITCH_TYPE = 'Cutter'
AND p.GAME_DATE BETWEEN 2016-01-01 AND 2017-12-31
GROUP BY b.PLAYER_ID
HAVING (Count(IN_ZONE)) ....
I think this is the right idea... but I'm a bit lost now as to how I can include the 15% in-zone/out-of-zone rates.
Thank you for any help.
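How do you define an in-zone or out-of-zone swinging strike rate of greater than 15%?
If it is Nr_YES / total:
HAVING sum(case when IN_ZONE = 'YES' then 1.0 else 0 end) / count(*) > 0.15
If it is Nr_YES / Nr_NO:
HAVING sum(case when IN_ZONE = 'YES' then 1.0 else 0 end) / nullif(sum(case when IN_ZONE = 'NO' then 1 else 0 end), 0) > 3.0 / 17
Note that 3/17 is 15/85. Since YES and NO are the only possible values, a 15% share of YES corresponds to a YES-to-NO ratio of 15 to 85.
The 1.0 and 3.0 are there because many databases use integer division when both operands are integers, which would turn 3/17 into 0.
Also note that with the sum of 0s and 1s I am actually simulating a
count(*) where IN_ZONE = 'YES'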
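As a rough illustration of the conditional-aggregation idea, the sketch below runs both forms of the HAVING clause in SQLite from Python. The pitch counts are made up, the table is cut down to two columns, and it only measures the share of IN_ZONE = 'YES' rows as the answer does; filtering on PITCH_CALL, PITCH_TYPE and GAME_DATE would still have to be added for the real question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PITCHES (PITCHER_PLAYER_ID INTEGER, IN_ZONE TEXT)")

# Hypothetical data: pitcher 1 has 2 YES out of 10 pitches (20%),
# pitcher 2 has 1 YES out of 10 pitches (10%).
pitches = (
    [(1, "YES")] * 2 + [(1, "NO")] * 8
    + [(2, "YES")] * 1 + [(2, "NO")] * 9
)
conn.executemany("INSERT INTO PITCHES VALUES (?, ?)", pitches)

# Share of YES among all pitches, compared with 15%.
SHARE_SQL = """
SELECT PITCHER_PLAYER_ID
FROM PITCHES
GROUP BY PITCHER_PLAYER_ID
HAVING SUM(CASE WHEN IN_ZONE = 'YES' THEN 1.0 ELSE 0 END) / COUNT(*) > 0.15
ORDER BY PITCHER_PLAYER_ID
"""

# Ratio of YES to NO, compared with 15/85 = 3/17. NULLIF avoids dividing by 0.
RATIO_SQL = """
SELECT PITCHER_PLAYER_ID
FROM PITCHES
GROUP BY PITCHER_PLAYER_ID
HAVING SUM(CASE WHEN IN_ZONE = 'YES' THEN 1.0 ELSE 0 END)
       / NULLIF(SUM(CASE WHEN IN_ZONE = 'NO' THEN 1 ELSE 0 END), 0) > 3.0 / 17
ORDER BY PITCHER_PLAYER_ID
"""

by_share = [row[0] for row in conn.execute(SHARE_SQL)]
by_ratio = [row[0] for row in conn.execute(RATIO_SQL)]
print(by_share, by_ratio)
```

Both queries keep the same pitchers, since a share above 15% and a YES-to-NO ratio above 15/85 are the same condition when YES and NO are the only values.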

SQL First Product and Second Product Combinations

I'm trying to find the preferred combination of item purchases among customers who have shopped here. Currently, this code tells me the number of combinations, but it a) does not tell me which product was purchased first and which second, and b) returns every possible combination across each customer's whole purchase history, however many purchases they made.
The data I currently have looks something like this:
CustomerKey CalendarDate PnLCategory ChannelName
8 2014-06-27 Laptop Online
8 2015-07-01 Mouse Retail
8 2015-12-13 Earphones Online
10 2014-01-10 Headphones Retail
14 2016-01-25 Laptop Online
14 2017-02-18 Mouse Retail
Based on this data, you can see that customers typically purchase a laptop and then a mouse. You can also see which channel, online or retail, customers typically purchase through.
I only care about the first two transactions a customer makes. Also, how would you include the channel each product was purchased from? Ideally, I would like to know which second product a customer is likely to purchase, and in which channel, given the first product.
SELECT A.PnLCategory, B.PnLCategory, COUNT (*) CountForCombination
FROM MyTable3 A
INNER JOIN MyTable3 B
ON A.CustomerKey = B.CustomerKey
AND A.PnLCategory < B.PnLCategory
GROUP BY A.PnLCategory, B.PnLCategory
ORDER BY CountForCombination desc
Successful result would look something like:
FirstProduct ChannelName1 SecondProduct ChannelName2 #Occurences
Laptop Online Mouse Retail 100
Mouse Retail Headphones Online 50
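One way to restrict the self-join to each customer's first two transactions is to number the purchases per customer by date and join purchase 1 to purchase 2. The sketch below does the numbering with a correlated COUNT(*) so that it runs on older SQLite versions; where window functions are available, ROW_NUMBER() OVER (PARTITION BY CustomerKey ORDER BY CalendarDate) would do the same job. It assumes a customer never has two purchases on the same date, and the output column names follow the desired result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable3 (
    CustomerKey INTEGER, CalendarDate TEXT,
    PnLCategory TEXT, ChannelName TEXT);
INSERT INTO MyTable3 VALUES
    (8,  '2014-06-27', 'Laptop',     'Online'),
    (8,  '2015-07-01', 'Mouse',      'Retail'),
    (8,  '2015-12-13', 'Earphones',  'Online'),
    (10, '2014-01-10', 'Headphones', 'Retail'),
    (14, '2016-01-25', 'Laptop',     'Online'),
    (14, '2017-02-18', 'Mouse',      'Retail');
""")

FIRST_TWO_SQL = """
WITH ranked AS (
    -- rn = 1 for a customer's earliest purchase, 2 for the next one, ...
    SELECT t.*,
           (SELECT COUNT(*) FROM MyTable3 AS e
            WHERE e.CustomerKey = t.CustomerKey
              AND e.CalendarDate < t.CalendarDate) + 1 AS rn
    FROM MyTable3 AS t
)
SELECT f.PnLCategory AS FirstProduct,
       f.ChannelName AS ChannelName1,
       s.PnLCategory AS SecondProduct,
       s.ChannelName AS ChannelName2,
       COUNT(*)      AS Occurrences
FROM ranked AS f
JOIN ranked AS s
  ON s.CustomerKey = f.CustomerKey AND s.rn = 2
WHERE f.rn = 1
GROUP BY f.PnLCategory, f.ChannelName, s.PnLCategory, s.ChannelName
ORDER BY Occurrences DESC
"""

combinations = conn.execute(FIRST_TWO_SQL).fetchall()
for row in combinations:
    print(row)
```

Customer 10 has only one purchase and so drops out of the inner join; customer 8's third purchase is ignored because only rn 1 and 2 are used.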

Storing a set of criteria in another table

I have a large table with sales data, useful data below:
RowID Date Customer Salesperson Product_Type Manufacturer Quantity Value
1 01-06-2004 James Ian Taps Tap Ltd 200 £850
2 02-06-2004 Apple Fran Hats Hats Inc 30 £350
3 04-06-2004 James Lawrence Pencils ABC Ltd 2000 £980
...
Many rows later...
...
185352 03-09-2012 Apple Ian Washers Tap Ltd 600 £80
I need to calculate a large set of targets of different types from this table. The target table is under my control and so far looks like:
TargetID Year Month Salesperson Target_Type Quantity
1 2012 7 Ian 1 6000
2 2012 8 James 2 2000
3 2012 9 Ian 2 6500
At present I am working out target types using a view of the first table which has a lot of extra columns:
SELECT YEAR(Date)
, MONTH(Date)
, Salesperson
, Quantity
, CASE WHEN Manufacturer IN ('Tap Ltd','Hats Inc') AND Product_Type = 'Hats' THEN True ELSE False END AS IsType1
, CASE WHEN Manufacturer = 'Hats Inc' AND Product_Type IN ('Hats','Coats') THEN True ELSE False END AS IsType2
...
...
, CASE WHEN Manufacturer IN ('Tap Ltd','Hats Inc') AND Product_Type = 'Hats' THEN True ELSE False END AS IsType24
, CASE WHEN Manufacturer IN ('Tap Ltd','Hats Inc') AND Product_Type = 'Hats' THEN True ELSE False END AS IsType25
FROM SalesTable
WHERE [some stuff here]
This is horrible to read/debug and I hate it!!
I've tried a few different ways of simplifying this but have been unable to get it to work.
The closest I have come is a third table holding the definition of each type, with the type number and the value for each field. This can be joined to the other tables to give me the full values, but I can't work out a way to cope with multiple values for each field.
Finally the question:
Is there a standard way this can be done or an easier/neater method other than one column for each type of target?
I know this is a complex problem so if anything is unclear please let me know.
Edit - What I need to get:
At the very end of the process I need to have targets displayed with actual sales:
Type Year Month Salesperson TargetQty ActualQty
2 2012 8 James 2000 2809
2 2012 9 Ian 6500 6251
Each row of the sales table could potentially satisfy 8 of the types.
Some more points:
1. I have 5 different columns that need to be defined against the targets (or set to NULL to include any value)
2. I have between 30 and 40 different types that need to be defined, several of the columns could contain as many as 10 different values
For point 2, if I am using a row for each permutation of values, 2 columns with 10 values each would give me 100 rows for each sales person for each month which is a lot but if this is the only way to define multiple values I will have to do this.
Sorry if this makes no sense!
If I am correct that the "Target_Type" field in the Target Table is based on the Manufacturer and the Product_Type, then you can create a TargetType table that looks like what's below and JOIN on Manufacturer and the Product_Type to get your Target_Type_Value:
ID Product_Type Manufacturer Target_Type_Value
1 Taps Tap Ltd 1
2 Hats Hats Inc 2
3 Coats Hats Inc 2
4 Hats Caps Inc 3
5 Pencils ABC Ltd 6
This should address the "multiple values for each field" problem by having a row for each possibility.
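To show how such a lookup table could feed the target-versus-actual report, here is a sketch in SQLite run from Python. The TargetType rows come from the table above and the targets from the question; the sales rows are invented for illustration (chosen so the totals match the figures in the question's desired output), ISO dates are assumed, and only two of the five criteria columns are modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesTable (
    Date TEXT, Salesperson TEXT, Product_Type TEXT,
    Manufacturer TEXT, Quantity INTEGER);
CREATE TABLE TargetType (
    ID INTEGER, Product_Type TEXT, Manufacturer TEXT,
    Target_Type_Value INTEGER);
CREATE TABLE Target (
    TargetID INTEGER, Year INTEGER, Month INTEGER,
    Salesperson TEXT, Target_Type INTEGER, Quantity INTEGER);

-- Hypothetical sales rows, not taken from the question.
INSERT INTO SalesTable VALUES
    ('2012-08-03', 'James', 'Hats',  'Hats Inc', 1500),
    ('2012-08-10', 'James', 'Coats', 'Hats Inc', 1309),
    ('2012-08-12', 'James', 'Taps',  'Tap Ltd',   400),
    ('2012-09-05', 'Ian',   'Hats',  'Hats Inc', 6251);
INSERT INTO TargetType VALUES
    (1, 'Taps', 'Tap Ltd', 1), (2, 'Hats', 'Hats Inc', 2),
    (3, 'Coats', 'Hats Inc', 2), (4, 'Hats', 'Caps Inc', 3),
    (5, 'Pencils', 'ABC Ltd', 6);
INSERT INTO Target VALUES
    (2, 2012, 8, 'James', 2, 2000), (3, 2012, 9, 'Ian', 2, 6500);
""")

TARGET_VS_ACTUAL_SQL = """
SELECT t.Target_Type, t.Year, t.Month, t.Salesperson,
       t.Quantity                   AS TargetQty,
       COALESCE(SUM(s.Quantity), 0) AS ActualQty
FROM Target AS t
-- one row per (Product_Type, Manufacturer) combination that counts
LEFT JOIN TargetType AS tt
       ON tt.Target_Type_Value = t.Target_Type
-- sales matching that combination, salesperson, year and month
LEFT JOIN SalesTable AS s
       ON s.Product_Type = tt.Product_Type
      AND s.Manufacturer = tt.Manufacturer
      AND s.Salesperson = t.Salesperson
      AND CAST(strftime('%Y', s.Date) AS INTEGER) = t.Year
      AND CAST(strftime('%m', s.Date) AS INTEGER) = t.Month
GROUP BY t.TargetID, t.Target_Type, t.Year, t.Month,
         t.Salesperson, t.Quantity
ORDER BY t.TargetID
"""

target_report = conn.execute(TARGET_VS_ACTUAL_SQL).fetchall()
for row in target_report:
    print(row)
```

Adding a new target type then means inserting rows into TargetType rather than adding another CASE column to the view, at the cost of one row per allowed combination of values.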