Combining two tables in a query and creating new columns from that - sql

I'm having issues with a query that I'm not ENTIRELY sure can be done with the way the database is set up. Basically, I'll be using two different tables in my query, let's say Transactions and Ticket Prices. They look like this (With some sample data):
TRANSACTIONS
Transaction ID | Ticket Quantity | Total Price | Salesperson | Ticket Price ID
5489 | 250 | 250 | Jim | 8765
5465 | 50 | 150 | Jim | 1258
7898 | 36 | 45 | Ann | 4774
Ticket Prices
Ticket Price ID | Quantity | Price | Bundle Name
8765 | 1 | 1 | 1 ticket, $1
4774 | 12 | 15 | 5 tickets, $10
1258 | 1 | 3 | 1 ticket, $3
What I'm aiming for is a report that breaks down each salesperson's sales by bundle type. The resulting table should be something like this:
Sales Volume/Salesperson
Name | Bundle A | Bundle B | Bundle C | Total
Jim | 250 | 0 | 50 | 300
Ann | 0 | 36 | 0 | 36
I've been searching the web, and it seems the best way of getting it like this is using various subqueries, which works well as far as getting the column titles properly displayed, but it doesn't work as far as the actual numerical totals. It basically combines the data, giving each salesperson a total readout (In this example, both Jim and Ann would have 250 sales in Bundle A, 36 in Bundle B, etc). Is there any way I can write a query that will give me the proper results? Or even something at least close to it? Thanks for any input.

You can use the PIVOT statement in Oracle to do this. A query might look something like this:
WITH pivot_data AS (
    SELECT t.salesperson, p.bundle_name, t.ticket_quantity
    FROM ticket_prices p, transactions t
    WHERE t.ticket_price_id = p.ticket_price_id
)
SELECT *
FROM pivot_data
PIVOT (
    SUM(ticket_quantity)  --<-- pivot_clause
    FOR bundle_name       --<-- pivot_for_clause
    IN ('1 ticket, $1', '5 tickets, $10', '1 ticket, $3')  --<-- pivot_in_clause
);
which would give you results like this:
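If you don't have an Oracle instance handy, the same reshaping can be sketched with conditional aggregation (SUM over CASE), which works in most databases. This is only a rough illustration run through Python's sqlite3 module; the underscore table and column names follow the answer's query and are assumptions about the real schema, and the bundle_a/b/c aliases are made up for the example. Unlike PIVOT, it returns 0 rather than NULL for empty cells and adds a total column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, loaded with the sample data from the question
    CREATE TABLE ticket_prices (ticket_price_id INTEGER, quantity INTEGER,
                                price INTEGER, bundle_name TEXT);
    CREATE TABLE transactions (transaction_id INTEGER, ticket_quantity INTEGER,
                               total_price INTEGER, salesperson TEXT,
                               ticket_price_id INTEGER);
    INSERT INTO ticket_prices VALUES
        (8765, 1, 1, '1 ticket, $1'),
        (4774, 12, 15, '5 tickets, $10'),
        (1258, 1, 3, '1 ticket, $3');
    INSERT INTO transactions VALUES
        (5489, 250, 250, 'Jim', 8765),
        (5465, 50, 150, 'Jim', 1258),
        (7898, 36, 45, 'Ann', 4774);
""")

# One SUM(CASE ...) per bundle turns bundle rows into columns.
rows = conn.execute("""
    SELECT t.salesperson,
           SUM(CASE WHEN p.bundle_name = '1 ticket, $1'   THEN t.ticket_quantity ELSE 0 END) AS bundle_a,
           SUM(CASE WHEN p.bundle_name = '5 tickets, $10' THEN t.ticket_quantity ELSE 0 END) AS bundle_b,
           SUM(CASE WHEN p.bundle_name = '1 ticket, $3'   THEN t.ticket_quantity ELSE 0 END) AS bundle_c,
           SUM(t.ticket_quantity) AS total
    FROM transactions t
    JOIN ticket_prices p ON p.ticket_price_id = t.ticket_price_id
    GROUP BY t.salesperson
    ORDER BY total DESC
""").fetchall()

for row in rows:
    print(row)
```

Because the join happens before the grouping, each salesperson only sums their own transactions, which avoids the "everyone gets 250" symptom described in the question.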

Related

SQLite: Calculating percentage of number of entries that match multiple criteria

I feel like I'm missing something super obvious here. I've read tons of threads and Googled my butt off, but I can't figure out how to get this code to work, though I've come close a couple of times.
I'm working off a table that has a few columns. I need to select the items that match two criteria and calculate the percentage of the whole that match that criteria and round the percentage using printf (I cannot use ROUND). I haven't tried using printf yet because I can't even get the percentage to calculate.
For example, a table called movies:
------------------------------------
id | title | score
-----------------------------------
1 | War Movie | 51
2 | Pony Movie | 100
3 | Big Wars | 55
4 | Bad Movie | 12
5 | Big Heist | 90
6 | Total War | 19
I want to pull all the movies that have a score > 50 and "war" in the title. In this case, it would return 33.33 ("War Movie" and "Big Wars" match the criteria; 2 / 6 = 33.33%).
Of all the things I've tried, I think this comes closest:
FROM movies
CROSS JOIN (SELECT SUM(title) AS s FROM movies) t
WHERE score > 50 AND title LIKE '%war%'
GROUP BY title
A couple of things go wrong with this:
The COUNT(*) returns a 1 in each column (instead of 6 - i.e. the total of all records in the table)
% of total returns (null)
This also feels close:
SELECT title,
SUM(title) * 1.0 / Count(title) * 100 AS Percentage
FROM movies
WHERE score > 50 AND title LIKE '%war%'
GROUP BY title
But this just returns the percentage as 0.
I'm stumped!
I want to pull all the movies that have a score > 50 and "war" in the title.
If I understand correctly, you want conditional aggregation:
select avg(case when score > 50 and title like '%war%' then 1.0 else 0 end) as ratio
from movies;
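To get from that ratio to the rounded percentage the question asks for, one option is to multiply by 100 and format with printf. Here is a minimal sketch using Python's sqlite3 module with the sample rows from the question; it assumes a SQLite build that includes printf (3.8.3 or later) and relies on LIKE being case-insensitive for ASCII by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, score INTEGER);
    INSERT INTO movies VALUES
        (1, 'War Movie', 51),
        (2, 'Pony Movie', 100),
        (3, 'Big Wars', 55),
        (4, 'Bad Movie', 12),
        (5, 'Big Heist', 90),
        (6, 'Total War', 19);
""")

# AVG over a 1.0/0 flag is the matching fraction; printf rounds it to two decimals.
(pct,) = conn.execute("""
    SELECT printf('%.2f',
                  100.0 * AVG(CASE WHEN score > 50 AND title LIKE '%war%'
                                   THEN 1.0 ELSE 0 END))
    FROM movies
""").fetchone()

print(pct)  # prints 33.33
```

Note there is no WHERE clause: filtering first would shrink the denominator to the matching rows only, which is why the earlier attempts could not see all six records.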

SQL How can this happen? - Query which normally returns 1 result alone actually resulted in multiple results when put inside WHERE clause

Question brief
I'm doing this practice exercise on w3resource and I can't understand why the solution works. I'm 2 days old to SQL, so I'd appreciate it very much if someone could help explain.
I have 2 tables, COMPANY(com_id, com_name) and PRODUCT(pro_name, pro_price, com_id). Each company has several products with different prices. I need to write a query that displays each company's name together with its most expensive product.
The sample answer on the practice is like this
SELECT c.com_name, p.pro_name, p.pro_price
FROM product p
INNER JOIN company c ON p.com_id = c.com_id
    AND p.pro_price = (
        SELECT MAX(p.pro_price)
        FROM product p
        WHERE p.com_id = c.com_id
    );
The query returned expected result.
com_name pro_name pro_price
--------- --------- -----------
Samsung Monitor 5000.00
iBall DVD drive 900.00
Epsion Printer 2600.00
Zebronics ZIP drive 250.00
Asus Mother Board 3200.00
Frontech Speaker 550.00
But I cannot understand how it works, especially the sub-query at the bottom. Isn't SELECT MAX(p.pro_price) supposed to return only the single highest price across all companies?
I also tried running the sub-query on its own, like this
SELECT MAX(p.pro_price)
FROM product p
INNER JOIN company c ON p.com_id = c.com_id
WHERE p.com_id = c.com_id;
... and it only returned 1 maximum value.
max(p.pro_price)
-----
5000.00
So how does the final result of the whole query include more than one record? There's no GROUP BY or anything.
By the way, the query seems to use two conditions for the INNER JOIN. I also tried moving the second condition into a WHERE clause and it still worked the same. This is one more thing I don't understand.
The databases involved
COMPANY table
COM_ID | COM_NAME
----------------
11 | Samsung
12 | iBall
13 | Epsion
14 | Zebronics
15 | Asus
16 | Frontech
PRODUCT table
PRO_NAME PRO_PRICE COM_ID
-------------------- ---------- ---------
Mother Board 3200 15
Key Board 450 16
ZIP drive 250 14
Speaker 550 16
Monitor 5000 11
DVD drive 900 12
CD drive 800 12
Printer 2600 13
Refill cartridge 350 13
Mouse 250 12
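The sub-query is a correlated sub-query: it refers to c.com_id from the outer query, so it is evaluated once per outer row and returns the maximum price for that row's company only. In your standalone test, c is joined inside the same query, so nothing ties it to an outer row and MAX runs over every product. The correlation comes from this condition:
WHERE p.com_id = c.com_id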
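To see the difference for yourself, here is a small sketch using Python's sqlite3 module and the sample tables from the question. The inner alias is renamed to p2 purely for readability (the original reuses p, which works but hides which table is which), and the price condition is placed in WHERE, which the question notes behaves the same for this inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (com_id INTEGER, com_name TEXT);
    CREATE TABLE product (pro_name TEXT, pro_price INTEGER, com_id INTEGER);
    INSERT INTO company VALUES
        (11, 'Samsung'), (12, 'iBall'), (13, 'Epsion'),
        (14, 'Zebronics'), (15, 'Asus'), (16, 'Frontech');
    INSERT INTO product VALUES
        ('Mother Board', 3200, 15), ('Key Board', 450, 16),
        ('ZIP drive', 250, 14), ('Speaker', 550, 16),
        ('Monitor', 5000, 11), ('DVD drive', 900, 12),
        ('CD drive', 800, 12), ('Printer', 2600, 13),
        ('Refill cartridge', 350, 13), ('Mouse', 250, 12);
""")

# Uncorrelated: nothing refers to an outer row, so MAX covers the whole table.
(overall_max,) = conn.execute("SELECT MAX(pro_price) FROM product").fetchone()

# Correlated: c.com_id comes from the outer row, so MAX is recomputed per company.
rows = conn.execute("""
    SELECT c.com_name, p.pro_name, p.pro_price
    FROM product p
    INNER JOIN company c ON p.com_id = c.com_id
    WHERE p.pro_price = (
        SELECT MAX(p2.pro_price)
        FROM product p2
        WHERE p2.com_id = c.com_id
    )
    ORDER BY c.com_id
""").fetchall()

print(overall_max)
for row in rows:
    print(row)
```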

Distributing Records Evenly From One Table to Another

I have 3 tables:
Users
-----
UserID (varchar)
Active (bit)
Refunds_Upload
--------------
BorrowerNumber (varchar)
Refunds
-------
BorrowerNumber
UserID
I first select all of the UserID values where Active = 1.
I need to insert the records from Refunds_Upload to Refunds but I need to insert the same (or as close as possible) number of records for each Active UserID.
For example, if Refunds_Upload has 20 records and the Users table has 5 people where Active = 1, then I would need to insert 4 records per UserID into table Refunds.
End Result would be:
BorrowerNumber UserID
105 Fred
110 Fred
111 Fred
115 Fred
120 Billy
122 Billy
123 Billy
125 Billy
130 Lucius
131 Lucius
133 Lucius
135 Lucius
138 Lucy
139 Lucy
140 Lucy
141 Lucy
142 Grady
143 Grady
144 Grady
145 Grady
Of course, it won't always come to an even number of records per User so I need to account for that as well.
First run this and check that it returns something like what you want to insert, before you uncomment the insert and actually carry it out.
--INSERT INTO Refunds (UserID, BorrowerNumber)
SELECT
    numbered_u.UserID,
    numbered_ru.BorrowerNumber
FROM
    (SELECT u.*,
            ROW_NUMBER() OVER (ORDER BY UserID) - 1 AS rown,
            SUM(CAST(Active AS INT)) OVER () AS count_users
     FROM Users u
     WHERE Active = 1) numbered_u
INNER JOIN
    (SELECT ru.*,
            ROW_NUMBER() OVER (ORDER BY BorrowerNumber) - 1 AS rown,
            COUNT(*) OVER () AS count_ru
     FROM Refunds_Upload ru) numbered_ru
ON
    ROUND(CAST(numbered_ru.rown AS FLOAT) / (count_ru / count_users), 0) = numbered_u.rown
The logic:
We number every interesting (Active = 1) row in Users and also count them all. This should return all 5 users, numbered 0 to 4, with a count_users of 5 on each row.
Then we join them to a similarly numbered list of Refunds_Upload rows (say 20). Those rows are numbered 0 to 19, for mathematical reasons that become apparent later. We count all of these rows too.
We then join the two datasets together, but the condition matches a range of values rather than exact values. The logic is "Refunds_Upload row number divided by the number of rows there should be per user", i.e. 0..19 / (20/5), equals the user row number. Hopefully refund rows 0 to 3 associate with user 0, refund rows 4 to 7 with user 1, and so on.
It's a little hard to debug without full data - I feel it might need a few +1 / -1 tweaks here and there.
I originally used FLOOR but switched to ROUND, as I think this might work for distributing sets of numbers where Refund/User isn't a whole number, e.g. your 240/13 example. Hopefully some users will have 18 rows and some 19.
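One of the tweaks worth trying: with rounding, the last rows can land one past the final user (19 / 4 = 4.75 rounds to 5, and there is no user 5), so a floor-based mapping may be safer. Below is a rough sketch of that arithmetic in plain Python, not the SQL itself. The distribute function is a hypothetical helper, and the idea is that row i goes to user floor(i * user_count / row_count), which always stays within 0..user_count-1.

```python
from collections import Counter


def distribute(borrowers, users):
    """Pair each borrower with a user so per-user counts differ by at most one."""
    n_rows, n_users = len(borrowers), len(users)
    # Integer floor of i * n_users / n_rows, for 0-based row number i.
    return [(b, users[i * n_users // n_rows])
            for i, b in enumerate(sorted(borrowers))]


# The 20-row / 5-user example from the question.
users = ["Fred", "Billy", "Lucius", "Lucy", "Grady"]
borrowers = [105, 110, 111, 115, 120, 122, 123, 125, 130, 131,
             133, 135, 138, 139, 140, 141, 142, 143, 144, 145]
assignment = distribute(borrowers, users)
per_user = Counter(user for _, user in assignment)

# The uneven 240 / 13 case: every user should get 18 or 19 rows.
uneven = distribute(list(range(240)), [f"user{k}" for k in range(13)])
uneven_counts = Counter(user for _, user in uneven)

print(per_user)
print(sorted(set(uneven_counts.values())))
```

In SQL the equivalent join condition would be along the lines of FLOOR(rown * count_users / count_ru) = user rown, taking care that the multiplication happens before the division so integer division does not discard the remainder.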

Selecting Records with Only One Value Occurring Within a Range

Each week I send a large quantity of eggplants from my eggplant farm to my various customers. Customers usually purchase the same number of eggplants weekly, but occasionally that amount varies. Since I have over 25,000,000 customers (big farm), I want to condense their purchase information into a more manageable table for the report I'm working on. Here's what my source data looks like -
CustAcct | PurchaseWeekEndDate | EggplantsPurchased
123 | 1/1/2012 | 50
123 | 1/8/2012 | 50
123 | 1/15/2012 | 50
123 | 1/22/2012 | 60
123 | 1/29/2012 | 50
123 | 2/5/2012 | 50
I would like the data in my new table to look like this -
CustAcct | StartRangePWEndDate | EndRangePWEndDate | EggplantsPurchased
123 | 1/1/2012 | 1/15/2012 | 50
123 | 1/22/2012 | 1/22/2012 | 60
123 | 1/29/2012 | 2/5/2012 | 50
Any ideas?
This is a rather hard problem. To solve it, you need to identify groups of orders that are the same. You can do this using a correlated subquery, to find the next date for each customer that has a different number of eggplants. This works as a group identifier.
Once you have that, the rest is just aggregation:
select CustAcct, min(PurchaseWeekEndDate), max(PurchaseWeekEndDate), EggplantsPurchased
from (select t.*,
             (select min(PurchaseWeekEndDate)
              from t t2
              where t.CustAcct = t2.CustAcct
                and t.EggplantsPurchased <> t2.EggplantsPurchased
                and t2.PurchaseWeekEndDate > t.PurchaseWeekEndDate
             ) as nextDate
      from t
     ) t
group by CustAcct, nextDate, EggplantsPurchased
And, since no eggplant farm in the world has 25,000,000 customers, what is the real nature of this question?
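For anyone who wants to try the grouping idea on the sample data, here is a minimal sketch using Python's sqlite3 module. The table name purchases and the aliases p, p2 and g are assumptions for the example, and the dates are stored as ISO strings (yyyy-mm-dd) so that they compare correctly as text, which the m/d/yyyy format in the question would not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (CustAcct INTEGER, PurchaseWeekEndDate TEXT,
                            EggplantsPurchased INTEGER);
    INSERT INTO purchases VALUES
        (123, '2012-01-01', 50),
        (123, '2012-01-08', 50),
        (123, '2012-01-15', 50),
        (123, '2012-01-22', 60),
        (123, '2012-01-29', 50),
        (123, '2012-02-05', 50);
""")

# nextDate = first later week with a different quantity; rows in the same
# unbroken run share the same nextDate, so it serves as a group identifier.
ranges = conn.execute("""
    SELECT CustAcct,
           MIN(PurchaseWeekEndDate) AS StartRange,
           MAX(PurchaseWeekEndDate) AS EndRange,
           EggplantsPurchased
    FROM (SELECT p.*,
                 (SELECT MIN(p2.PurchaseWeekEndDate)
                  FROM purchases p2
                  WHERE p2.CustAcct = p.CustAcct
                    AND p2.EggplantsPurchased <> p.EggplantsPurchased
                    AND p2.PurchaseWeekEndDate > p.PurchaseWeekEndDate
                 ) AS nextDate
          FROM purchases p) g
    GROUP BY CustAcct, nextDate, EggplantsPurchased
    ORDER BY CustAcct, StartRange
""").fetchall()

for row in ranges:
    print(row)
```

The final run has no later differing week, so its nextDate is NULL; GROUP BY treats those NULLs as one group, which is what keeps the last two weeks together.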

SQL summary by ID with period to period comparison

I am a beginner in SQL, hope someone can help me on this:
I have an Items Category table:
ItemID | ItemName | ItemCategory | Active/Inactive
100 Carrot Veg Yes
101 Apple Fruit Yes
102 Beef Meat No
103 Pineapple Fruit Yes
And I have a sales table:
Date | ItemID | Sales
01/01/2010 100 50
05/01/2010 101 200
06/01/2010 101 250
06/01/2010 102 300
07/01/2010 103 50
08/01/2010 100 100
10/01/2010 102 250
How can I produce a sales summary table by item and period, as below (active items only)?
ItemID | ItemName | ItemCategory | (01/01/2010 – 07/01/2010) | (08/01/2010 – 14/01/2010)
100 Carrot Veg 50 100
101 Apple Fruit 450 0
103 Pineapple Fruit 50 0
A very dirty solution
SELECT s.ItemId,
       (SELECT ItemName FROM Items WHERE ItemId = s.ItemId) AS ItemName,
       ISNULL((SELECT SUM(Sales) FROM Sales
               WHERE [Date] BETWEEN '2010/01/01' AND '2010/01/07'
                 AND ItemId = s.ItemId
               GROUP BY ItemId), 0) AS firstdaterange,
       ISNULL((SELECT SUM(Sales) FROM Sales
               WHERE [Date] BETWEEN '2010/01/08' AND '2010/01/14'
                 AND ItemId = s.ItemId
               GROUP BY ItemId), 0) AS seconddaterange
FROM Sales s
INNER JOIN Items i ON s.ItemId = i.ItemId
WHERE i.IsActive = 'Yes'
GROUP BY s.ItemId
Again a dirty solution, also the dates are hardcoded. You can probably turn this into a stored procedure taking in the dates as parameters.
I'm not too clued up on PIVOT command but maybe that will be worth a google.
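As a rough idea of what the parameterised version could look like, here is a sketch using Python's sqlite3 module instead of a stored procedure. It swaps the correlated subqueries for conditional aggregation, uses a LEFT JOIN so active items with no sales still appear, and takes the period boundaries as bound parameters. The function name sales_by_period and the column names SaleDate and Active are assumptions for the example; dates are ISO strings so BETWEEN compares them correctly as text.

```python
import sqlite3


def sales_by_period(conn, p1_start, p1_end, p2_start, p2_end):
    """Return (ItemID, ItemName, ItemCategory, period1, period2) for active items."""
    sql = """
        SELECT i.ItemID, i.ItemName, i.ItemCategory,
               COALESCE(SUM(CASE WHEN s.SaleDate BETWEEN ? AND ? THEN s.Sales END), 0),
               COALESCE(SUM(CASE WHEN s.SaleDate BETWEEN ? AND ? THEN s.Sales END), 0)
        FROM Items i
        LEFT JOIN Sales s ON s.ItemID = i.ItemID
        WHERE i.Active = 'Yes'
        GROUP BY i.ItemID, i.ItemName, i.ItemCategory
        ORDER BY i.ItemID
    """
    return conn.execute(sql, (p1_start, p1_end, p2_start, p2_end)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (ItemID INTEGER, ItemName TEXT, ItemCategory TEXT, Active TEXT);
    CREATE TABLE Sales (SaleDate TEXT, ItemID INTEGER, Sales INTEGER);
    INSERT INTO Items VALUES
        (100, 'Carrot', 'Veg', 'Yes'), (101, 'Apple', 'Fruit', 'Yes'),
        (102, 'Beef', 'Meat', 'No'), (103, 'Pineapple', 'Fruit', 'Yes');
    INSERT INTO Sales VALUES
        ('2010-01-01', 100, 50), ('2010-01-05', 101, 200),
        ('2010-01-06', 101, 250), ('2010-01-06', 102, 300),
        ('2010-01-07', 103, 50), ('2010-01-08', 100, 100),
        ('2010-01-10', 102, 250);
""")

summary = sales_by_period(conn, "2010-01-01", "2010-01-07",
                          "2010-01-08", "2010-01-14")
for row in summary:
    print(row)
```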
You can pivot the data using the SQL PIVOT operator. Unfortunately, that operator has limited scope due to the requirement to pre-specify the output columns.
You normally achieve this by grouping on a calculated column (in this case, one that computes the week number or first day of the week in which each row falls). You can then either generate SQL on-the-fly with columns derived using SELECT DISTINCT week FROM result, or just drop the result into Excel and use its pivot table facility.
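A minimal sketch of that two-step approach, using Python's sqlite3 module: the SQL groups on a calculated week number, and the pivot into columns happens afterwards in client code (standing in for Excel or generated SQL). The column names SaleDate and Active, and the choice of 2010-01-01 as the start of week 0, are assumptions for the example.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (ItemID INTEGER, ItemName TEXT, ItemCategory TEXT, Active TEXT);
    CREATE TABLE Sales (SaleDate TEXT, ItemID INTEGER, Sales INTEGER);
    INSERT INTO Items VALUES
        (100, 'Carrot', 'Veg', 'Yes'), (101, 'Apple', 'Fruit', 'Yes'),
        (102, 'Beef', 'Meat', 'No'), (103, 'Pineapple', 'Fruit', 'Yes');
    INSERT INTO Sales VALUES
        ('2010-01-01', 100, 50), ('2010-01-05', 101, 200),
        ('2010-01-06', 101, 250), ('2010-01-06', 102, 300),
        ('2010-01-07', 103, 50), ('2010-01-08', 100, 100),
        ('2010-01-10', 102, 250);
""")

# Step 1: group on a calculated column (0-based week number from the start date).
weekly = conn.execute("""
    SELECT s.ItemID,
           CAST((julianday(s.SaleDate) - julianday(:start)) / 7 AS INTEGER) AS week_no,
           SUM(s.Sales) AS total
    FROM Sales s
    JOIN Items i ON i.ItemID = s.ItemID
    WHERE i.Active = 'Yes'
    GROUP BY s.ItemID, week_no
    ORDER BY s.ItemID, week_no
""", {"start": "2010-01-01"}).fetchall()

# Step 2: pivot client-side; the output columns are whatever weeks occur in the data.
weeks = sorted({week for _, week, _ in weekly})
pivot = defaultdict(lambda: [0] * len(weeks))
for item_id, week, total in weekly:
    pivot[item_id][weeks.index(week)] = total
pivot = dict(pivot)

print(weeks)
print(pivot)
```

Since the week columns are discovered from the data rather than hardcoded, adding a third week of sales needs no change to the query.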