Oracle - select statement to rollup multiple tables within a time frame - sql

I have 3 Oracle tables for a project that link a demo Transaction table to Transaction_Customer and Transaction_Employee as shown below. Each transaction can have multiple customers involved and many employees involved.
I am trying to write a SQL query which will list each Customer_ID that has had transactions with multiple employees within a one period. I would like the output to include a single row for each Customer_ID with a comma separated list of which Employee_IDs had a transaction with that customer.
The output should look like this:
Customer_ID|Employees
601|007,008,009
The basic query to join the tables together looks like this:
select * from transactions t
left join transactions_customer tc
on t.t_id = tc.t_id
left join transactions_employee te
on t.t_id = te.t_id
How do I get this do I finish this assignment and get the query working the way intended?
Thank you!
Transactions
T_ID|Date|Amount
1|1/10/2017|100
2|1/10/2017|200
3|1/31/2017|150
4|2/16/2017|175
5|2/17/2017|175
6|2/18/2017|185
Transactions_Customer
T_ID|Customer_ID
1|600
1|601
1|602
2|605
3|606
4|601
5|607
6|607
Transactions_Employee
T_ID|Employee_ID
1|007
1|008
2|009
3|008
4|009
5|007
6|007

Is this what you want?
select tc.Customer_id,
listagg(te.employee_id, ',') within group (order by te.employee_id) as employees
from Transactions_Customer tc join
Transactions_Employee te
on tc.t_id = te.t_id
group by tc.Customer_id;
You only need the Transactions table for filtering on the date. Your question alludes to such filtering but does not exactly describe it, so I left it out.
Edit:
The customer data (and perhaps the employees data too) has duplicates. To avoid these in the output:
select tc.Customer_id,
listagg(te.employee_id, ',') within group (order by te.employee_id) as employees
from (select distinct tc.t_id, tc.customer_id
from Transactions_Customer tc
) tc join
(select distinct te.t_id, te.employee_id
from Transactions_Employee te
) te
on tc.t_id = te.t_id
group by tc.Customer_id;

Related

Find the average from one table and compare it to another

All i want to do is to join two tables, list ALL the rows from the first table, find the average from the second table from all the rows, then list only the ones that are greater than the average.
This is wahat i have done so far, and i am only getting one greater than the average but there are others.
SELECT winner_age, AVG(actor_age) FROM oscar_winners
INNER JOIN actors ON actors.id = oscar_winners.id
WHERE winner_age > (
SELECT AVG(actor_age)
)
You don't really need a join here:
SELECT o.WINNER_AGE
FROM OSCAR_WINNERS o
WHERE o.WINNER_AGE > (SELECT AVG(a.ACTOR_AGE)
FROM ACTORS a)
Something like this?
SELECT actors.*, (SELECT AVG(actor_age) from actors) as average
FROM oscar_winners
INNER JOIN actors ON actors.id = oscar_winners.id and actors.winner_age > (SELECT AVG(actor_age) from actors)
The problem with your query is because you are using a where clause, while you should probably be using having:
SELECT w.winner_age, AVG(a.actor_age)
FROM oscar_winners w
INNER JOIN actors a
ON actors.id = oscar_winners.id
group by w.winner_age
having w.winner_age > AVG(a.actor_age)

SQL Server 2016 Sub Query Guidance

I am currently working on an assignment for my SQL class and I am stuck. I'm not looking for full code to answer the question, just a little nudge in the right direction. If you do provide full code would you mind a small explanation as to why you did it that way (so I can actually learn something.)
Here is the question:
Write a SELECT statement that returns three columns: EmailAddress, ShipmentId, and the order total for each Client. To do this, you can group the result set by the EmailAddress and ShipmentId columns. In addition, you must calculate the order total from the columns in the ShipItems table.
Write a second SELECT statement that uses the first SELECT statement in its FROM clause. The main query should return two columns: the Client’s email address and the largest order for that Client. To do this, you can group the result set by the EmailAddress column.
I am confused on how to pull in the EmailAddress column from the Clients table, as in order to join it I have to bring in other tables that aren't being used. I am assuming there is an easier way to do this using sub Queries as that is what we are working on at the time.
Think of SQL as working with sets of data as opposed to just tables. Tables are merely a set of data. So when you view data this way you immediately see that the query below returns a set of data consisting of the entirety of another set, being a table:
SELECT * FROM MyTable1
Now, if you were to only get the first two columns from MyTable1 you would return a different set that consisted only of columns 1 and 2:
SELECT col1, col2 FROM MyTable1
Now you can treat this second set, a subset of data as a "table" as well and query it like this:
SELECT
*
FROM (
SELECT
col1,
col2
FROM
MyTable1
)
This will return all the columns from the two columns provided in the inner set.
So, your inner query, which I won't write for you since you appear to be a student, and that wouldn't be right for me to give you the entire answer, would be a query consisting of a GROUP BY clause and a SUM of the order value field. But the key thing you need to understand is this set thinking: you can just wrap the ENTIRE query inside brackets and treat it as a table the way I have done above. Hopefully this helps.
You need a subquery, like this:
select emailaddress, max(OrderTotal) as MaxOrder
from
( -- Open the subquery
select Cl.emailaddress,
Sh.ShipmentID,
sum(SI.Value) as OrderTotal -- Use the line item value column in here
from Client Cl -- First table
inner join Shipments Sh -- Join the shipments
on Sh.ClientID = Cl.ClientID
inner join ShipItem SI -- Now the items
on SI.ShipmentID = Sh.ShipmentID
group by C1.emailaddress, Sh.ShipmentID -- here's your grouping for the sum() aggregation
) -- Close subquery
group by emailaddress -- group for the max()
For the first query you can join the Clients to Shipments (on ClientId).
And Shipments to the ShipItems table (on ShipmentId).
Then group the results, and count or sum the total you need.
Using aliases for the tables is usefull, certainly when you select fields from the joined tables that have the same column name.
select
c.EmailAddress,
i.ShipmentId,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
order by i.ShipmentId, c.EmailAddress;
Using that grouped query in a subquery, you can get the Maximum total per EmailAddress.
select EmailAddress,
-- max(TotalShipItems) as MaxTotalShipItems,
max(TotalPriceDiscounted) as MaxTotalPriceDiscounted
from (
select
c.EmailAddress,
-- i.ShipmentId,
-- count(*) as TotalShipItems,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
) q
group by EmailAddress
order by EmailAddress
Note that an ORDER BY is mostly meaningless inside a subquery if you don't use TOP.

Aggregate query across two tables in SQL?

I'm working in BigQuery. I've got two tables:
TABLE: orgs
code: STRING
group: STRING
TABLE: org_employees
code: STRING
employee_count: INTEGER
The code in each table is effectively a foreign key. I want to get all unique groups, with a count of the orgs in them, and (this is the tricky bit) a count of how many of of those orgs only have a single employee. Data that looks like this:
group,orgs,single_handed_orgs
00Q,23,12
00K,15,7
I know how to do the first bit, get the unique groups and count of associated orgs from the orgs table:
SELECT
count(code), group
FROM
[orgs]
GROUP BY group
And, I know how to get the count of single-handed orgs from the practice table:
SELECT
code,
(employee_count==1) AS is_single_handed
FROM
[org_employees]
But I'm not sure how to glue them together. Can anyone help?
for BigQuery: legacy SQL
SELECT
[group],
COUNT(o.code) as orgs,
SUM(employee_count = 1) as single_handed_orgs
FROM [orgs] AS o
LEFT JOIN [org_employees] AS e
ON e.code = o.code
GROUP BY [group]
using LEFT JOIN in case if some codes are missing in org_employees tables
for BigQuery: standard SQL
SELECT
grp,
COUNT(o.code) AS orgs ,
SUM(CASE employee_count WHEN 1 THEN 1 ELSE 0 END) AS single_handed_orgs
FROM orgs AS o
LEFT JOIN org_employees AS e
ON e.code = o.code
GROUP BY grp
Note use of grp vs group - looks like standard sql does like use of Reserved Keywords even if i put backticks around
Confirmed:
you can use keyword with backticks around
You could join the two tables to get the groups that have just one employee. Then you wrap this in a sub query and you count the groups that you have.
I'm using a COUNT DISTINCT and GROUP BY because I don't know how your data is structured. Is there only a single line per group or multiple?
SELECT
COUNT(DISTINCT group)
FROM (
SELECT
group
FROM
orgs AS o INNER JOIN org_employees AS e ON o.code = e.code
WHERE
employee_count = 1
GROUP BY
group
)

SQL query from an existing list

Sql newb so im trying to figure this out:
I pull a list of customer names that purchase a certain item:
select r.name
from records r
where r.item_purchased='apple'
Now I want to take that list of customers and pull the records for everything they have purchased, but I can't get past the errors. I've tried things like:
with customer_list as
(above)
select r.*
from records r
where r.name=customer_list
If you do want to use a CTE, this should work:
with customer_list as
(
select r.name
from records r
where r.item_purchased='apple'
)
select r.*
from records r
where r.name in (select name from customer_list)
The difference between this and a JOIN (e.g. Michael's solution) is the join will produce a different result if the same customer purchased an apple more than once.
I believe a self join should solve your problem:
select distinct r2.*
from records r
join records r2
on r2.name = r.name
where r.item_purchased='apple'
EDIT: Added a DISTINCT based on #a_horse_with_no_name's insight into the difference between the results, because I doubt the duplication caused by the self join would be the desired result.
Select * from records where name in (
select name
from records
where item_purchased='apple'
)

Update using Distinct SUM

I have found a few good resources that show I should be able to merge a select query with an update, but I just can't get my head around of the correct formatting.
I have a select statement that is getting info for me, and I want to pretty much use those results to Update an account table that matches the accountID in the select query.
Here is the select statement:
SELECT DISTINCT SUM(b.workers)*tt.mealTax as MealCost,b.townID,b.accountID
FROM buildings AS b
INNER JOIN town_tax AS tt ON tt.townID = b.townID
GROUP BY b.townID,b.accountID
So in short I want the above query to be merged with:
UPDATE accounts AS a
SET a.wealth = a.wealth - MealCost
Where MealCost is the result from the select query. I am sure there is a way to put this into one, I just haven't quite been able to connect the dots to get it to run consistently without separating into two queries.
First, you don't need the distinct when you have a group by.
Second, how do you intend to link the two results? The SELECT query is returning multiple rows per account (one for each town). Presumably, the accounts table has only one row. Let's say that you wanted the average MealCost for the update.
The select query to get this is:
SELECT accountID, avg(MealCost) as avg_Mealcost
FROM (SELECT SUM(b.workers)*tt.mealTax as MealCost, b.townID, b.accountID
FROM buildings AS b INNER JOIN
town_tax AS tt
ON tt.townID = b.townID
GROUP BY b.townID,b.accountID
) a
GROUP BY accountID
Now, to put this into an update, you can use syntax like the following:
UPDATE accounts
set accounts.wealth = accounts.wealth + asum.avg_mealcost
from (SELECT accountID, avg(MealCost) as avg_Mealcost
FROM (SELECT SUM(b.workers)*tt.mealTax as MealCost, b.townID, b.accountID
FROM buildings AS b INNER JOIN
town_tax AS tt
ON tt.townID = b.townID
GROUP BY b.townID,b.accountID
) a
GROUP BY accountID
) asum
where accounts.accountid = asum.accountid
This uses SQL Server syntax, which I believe is the same as for Oracle and most other databases. Mysql puts the "from" clause before the "set" and allows an alias on "update accounts".