How to insert line break in between ranked items in SQL?

I would like to ask how to insert a line break in SQL.
For example, in the image below I have a few items ranked from the highest price to the lowest. I would like to insert a line break between the groups of items.
My query is as follows:
with
item_list as (
Select
item_name, price
from table_A
where month = 3 and year = 2020
order by price desc
)
SELECT *
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY a.item_name
ORDER BY price DESC) as rank, *
From item_list a)
WHERE rank <= 5
ORDER BY item_name

Related

BigQuery SQL: Sum of first N related items

I would like to know the sum of a value over the first n items in a related table. For example, I want to get the sum of a company's first 6 invoices (the invoices can be sorted by ID ascending).
Current SQL:
SELECT invoices.company_id, SUM(invoices.amount)
FROM invoices
JOIN companies on invoices.company_id = companies.id
GROUP BY invoices.company_id
This seems simple but I can't wrap my head around it.
Consider also the approach below:
select company_id, (
select sum(amount)
from t.amounts amount
) as top_six_invoices_amount
from (
select invoices.company_id,
array_agg(invoices.amount order by invoices.invoice_id limit 6) amounts
from your_table invoices
group by invoices.company_id
) t
You can assign row numbers to the rows within each partition, ordered by invoice ID, and then filter on that number, something like this:
with array_table as (
select 'a' field, * from unnest([3, 2, 1 ,4, 6, 3]) id
union all
select 'b' field, * from unnest([1, 2, 1, 7]) id
)
select field, sum(id) from (
select field, id, row_number() over (partition by a.field order by id desc) rownum
from array_table a
)
where rownum < 3
group by field
More examples of analytic functions here:
https://medium.com/@aliz_ai/analytic-functions-in-google-bigquery-part-1-basics-745d97958fe2
https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts
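The row-number approach above can be tried locally. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) stands in for BigQuery; the sample rows and the value of N are hypothetical and not taken from the question.

```python
import sqlite3

# Hypothetical sample data: (company_id, invoice_id, amount).
rows = [
    (1, 1, 10), (1, 2, 20), (1, 3, 30),
    (2, 4, 5), (2, 5, 7),
]

N = 2  # how many of the earliest invoices to sum per company (6 in the question)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (company_id INTEGER, invoice_id INTEGER, amount INTEGER)"
)
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", rows)

# Number the invoices of each company by invoice_id, keep the first N, then sum.
query = """
SELECT company_id, SUM(amount) AS first_n_amount
FROM (
    SELECT company_id, amount,
           ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY invoice_id) AS rownum
    FROM invoices
)
WHERE rownum <= ?
GROUP BY company_id
ORDER BY company_id
"""
first_n_sums = conn.execute(query, (N,)).fetchall()
print(first_n_sums)  # [(1, 30), (2, 12)]
```

Companies with fewer than N invoices simply contribute all of their rows, as company 2 does here.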

Ranking and obtaining data across moving window

I have following table -
create table iphone_defects(
product string
,defect string
,qty int64
,fwkyr int64
,fwenddate date
);
insert into iphone_defects values ('iPhone','Glass breakage',100,202112,'2020-09-20');
insert into iphone_defects values ('iPhone','No sound',30,202111,'2020-09-30');
insert into iphone_defects values ('iPhone','Glass breakage',25,202110,'2020-09-06');
insert into iphone_defects values ('iPhone','Audio problem',20,202109,'2020-08-30');
insert into iphone_defects values ('iPhone','No sound',60,202108,'2020-08-23');
insert into iphone_defects values ('iPhone','Empty boxes',30,202107,'2020-08-16');
insert into iphone_defects values ('iPhone','Audio problem',25,202106,'2020-08-09');
I am expecting the following result -
fwkyr refers to the financial week in a year. I have added an additional column, fwenddate, which holds the max date in that financial week.
The ask is to obtain the defect with the largest quantity in a 4-week window counted from the current week. For example, for fwkyr 202112 the top defect is 'Glass breakage' and the total quantity is 100.
This is a static window. My actual use case needs 52 weeks.
Without the moving window, I know that I can rank and get the data, but I am not sure how to even approach this problem. Any help?
Per the updated question, my updated solution gets much longer and changes quite a bit.
I am still not sure whether the user selects the week from which the next 52 weeks are counted, or whether you run this calculation from the start (week 1) of every year.
I also assume that one of your insert statements has a typo when compared with your desired output table, so I changed it to fit the output table.
1. Create table
create table table.defects(
product string
,defect string
,qty int64
,fwkyr int64
,fwenddate date
);
2. Insert data (adjusted last insert to match your output table)
insert into table.defects values ('iPhone','Glass breakage',100,202112,'2020-09-20');
insert into table.defects values ('iPhone','No sound',30,202111,'2020-09-30');
insert into table.defects values ('iPhone','Glass breakage',25,202110,'2020-09-06');
insert into table.defects values ('iPhone','Audio problem',20,202109,'2020-08-30');
insert into table.defects values ('iPhone','No sound',60,202108,'2020-08-23');
insert into table.defects values ('iPhone','Empty boxes',30,202107,'2020-08-16');
insert into table.defects values ('iPhone','Audio problem',55,202106,'2020-08-09');
3. Query for results
###############################################################################
### start count of weeks since selected first week and
### get number of weeks by desired range
###############################################################################
WITH
get_weeks AS (
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY product ORDER BY fwkyr DESC) AS week_numbering,
SPLIT(CAST(ROW_NUMBER() OVER(PARTITION BY product ORDER BY fwkyr)/4 AS string), '.')[
OFFSET
(0)] AS week_id_0,
FROM
table.defects
ORDER BY
fwkyr DESC
),
###############################################################################
### produce filter column for each window period by offsetting
###############################################################################
get_weeks_consequtive AS (
SELECT
*,
LAG(week_id_0,1) OVER(PARTITION BY product ORDER BY fwkyr DESC) AS week_id_1,
LAG(week_id_0,2) OVER(PARTITION BY product ORDER BY fwkyr DESC) AS week_id_2,
LAG(week_id_0,3) OVER(PARTITION BY product ORDER BY fwkyr DESC) AS week_id_3
FROM
get_weeks ),
###############################################################################
### create tables and calculations per window using filter column where you group by for qty and keep top qty only
###############################################################################
week_id_0 AS (
SELECT
SUM(qty) AS qty,
product,
defect,
week_id
FROM (
SELECT
* EXCEPT(week_id_0,
week_id_1,
week_id_2,
week_id_3),
MAX(fwkyr) OVER() AS week_id
FROM
get_weeks_consequtive
WHERE
week_id_0 = '1' )
GROUP BY
2,
3,
4
ORDER BY
1 DESC
LIMIT
1),
week_id_1 AS (
SELECT
SUM(qty) AS qty,
product,
defect,
week_id
FROM (
SELECT
* EXCEPT(week_id_0,
week_id_1,
week_id_2,
week_id_3),
MAX(fwkyr) OVER() AS week_id
FROM
get_weeks_consequtive
WHERE
week_id_1 = '1' )
GROUP BY
2,
3,
4
ORDER BY
1 DESC
LIMIT
1),
week_id_2 AS (
SELECT
SUM(qty) AS qty,
product,
defect,
week_id
FROM (
SELECT
* EXCEPT(week_id_0,
week_id_1,
week_id_2,
week_id_3),
MAX(fwkyr) OVER() AS week_id
FROM
get_weeks_consequtive
WHERE
week_id_2 = '1' )
GROUP BY
2,
3,
4
ORDER BY
1 DESC
LIMIT
1),
week_id_3 AS (
SELECT
SUM(qty) AS qty,
product,
defect,
week_id
FROM (
SELECT
* EXCEPT(week_id_0,
week_id_1,
week_id_2,
week_id_3),
MAX(fwkyr) OVER() AS week_id
FROM
get_weeks_consequtive
WHERE
week_id_3 = '1' )
GROUP BY
2,
3,
4
ORDER BY
1 DESC
LIMIT
1)
###############################################################################
### union all selected windows
###############################################################################
SELECT
*
FROM
week_id_0
UNION ALL
SELECT
*
FROM
week_id_1
UNION ALL
SELECT
*
FROM
week_id_2
UNION ALL
SELECT
*
FROM
week_id_3
ORDER BY
week_id DESC
[Screenshots of the intermediate results: get_weeks, get_weeks_consequtive, week_id_1, and the final result]
PS ---
I brainstormed this quickly after your update; perhaps there is a better way, and I would be interested in seeing it.
Anyhow, for such lengthy queries I typically write a Python script with text templates for the repetitive parts and use a loop to expand them to the desired length, incrementing the changing values and inserting them with f-strings.
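The templating idea from the PS can be sketched as follows. This is a minimal illustration, not the answerer's actual script: the helper names lag_columns, window_cte and window_ctes_and_union are hypothetical, and only the repetitive parts of the query above (the LAG columns, the week_id_N CTEs and the final UNION ALL) are generated.

```python
def lag_columns(n_windows: int) -> str:
    """LAG columns week_id_1 .. week_id_{n-1}, as used in get_weeks_consequtive."""
    return ",\n".join(
        f"LAG(week_id_0,{i}) OVER(PARTITION BY product ORDER BY fwkyr DESC) AS week_id_{i}"
        for i in range(1, n_windows)
    )


def window_cte(i: int, n_windows: int) -> str:
    """One week_id_{i} CTE; only the index and the EXCEPT list change per window."""
    except_cols = ", ".join(f"week_id_{k}" for k in range(n_windows))
    return f"""week_id_{i} AS (
SELECT SUM(qty) AS qty, product, defect, week_id
FROM (
  SELECT * EXCEPT({except_cols}), MAX(fwkyr) OVER() AS week_id
  FROM get_weeks_consequtive
  WHERE week_id_{i} = '1')
GROUP BY 2, 3, 4
ORDER BY 1 DESC
LIMIT 1)"""


def window_ctes_and_union(n_windows: int) -> str:
    """All window CTEs followed by the UNION ALL that stitches them together."""
    ctes = ",\n".join(window_cte(i, n_windows) for i in range(n_windows))
    unions = "\nUNION ALL\n".join(
        f"SELECT * FROM week_id_{i}" for i in range(n_windows)
    )
    return f"{ctes}\n{unions}\nORDER BY week_id DESC"


# Reproduce the 4-window case from the query above.
generated_lags = lag_columns(4)
generated_sql = window_ctes_and_union(4)
print(generated_lags)
print(generated_sql)
```

The generated text still has to be pasted after the hand-written get_weeks and get_weeks_consequtive CTEs, and the grouping expression in get_weeks would need its own adjustment for a different window length.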

SQL Query - second ID of a list ordered by date and ID

I have a SQL database with a list of customer IDs (CustomerID) and invoices, the specific product purchased in each invoice (ProductID), the Date, and the Income of each invoice. I need to write a query that retrieves, for each product, the second customer who made a purchase.
How do I do that?
EDIT:
I have come up with the following query:
SELECT *,
LEAD(CustomerID) OVER (ORDER BY ProductID, Date) AS 'Second Customer Who Made A Purchase'
FROM a
ORDER BY ProductID, Date ASC
However, this query presents multiple results for products that have more than two purchases. Can you advise?
SELECT a2.ProductID,
(
SELECT a1.CustomerID
FROM a a1
WHERE a1.ProductID = a2.ProductID
ORDER BY Date asc
LIMIT 1,1
) as SecondCustomer
FROM a a2
GROUP BY a2.ProductID
I need to write a query that will retrieve for each product, which was the second customer who made a purchase
This sounds like a window function:
select a.*
from (select a.*,
row_number() over (partition by productid order by date asc) as seqnum
from a
) a
where seqnum = 2;
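To see the window-function answer in action, here is a minimal sketch using SQLite through Python's sqlite3 module. The table name a and the column names follow the question; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a (CustomerID INTEGER, ProductID INTEGER, Date TEXT, Income REAL)"
)
# Hypothetical purchases: product 1 has three, product 2 has two, product 3 has one.
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?, ?)",
    [
        (101, 1, "2020-01-01", 10.0),
        (102, 1, "2020-01-02", 12.0),
        (103, 1, "2020-01-03", 9.0),
        (104, 2, "2020-02-01", 20.0),
        (101, 2, "2020-02-05", 25.0),
        (105, 3, "2020-03-01", 7.0),
    ],
)

# Number the purchases of each product by date and keep the second one.
query = """
SELECT ProductID, CustomerID
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Date ASC) AS seqnum
    FROM a
) AS ranked
WHERE seqnum = 2
ORDER BY ProductID
"""
second_customers = conn.execute(query).fetchall()
print(second_customers)  # [(1, 102), (2, 101)]
```

Product 3 has only one purchase, so it does not appear in the result. Note that this picks the second purchase row; if the same customer could buy a product twice in a row, the rows would need to be de-duplicated by customer first.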

Get most expensive and cheapest items from two tables

I'm trying to get the most expensive and cheapest items from two different tables.
The output should be one row with the values for MostExpensiveItem, MostExpensivePrice, CheapestItem, CheapestPrice
I was able to get the price of the most expensive and cheapest items in the two tables with following query:
SELECT
MAX(ExtrasPrice) as MostExpensivePrice, MIN(ExtrasPrice) as CheapestPrice
FROM
(
SELECT ExtrasPrice FROM Extras
UNION ALL
SELECT ItemPrice FROM Items
) foo
How can I add the names of the items (ItemName, ExtrasName) to my output? Again, there should only be one row as the output.
Try this:
SELECT TOP 1 FIRST_VALUE(Price) OVER (ORDER BY Price) AS MinPrice,
FIRST_VALUE(Name) OVER (ORDER BY Price) AS MinName,
FIRST_VALUE(Price) OVER (ORDER BY Price DESC) AS MaxPrice,
FIRST_VALUE(Name) OVER (ORDER BY Price DESC) AS MaxName
FROM (
SELECT ExtrasName AS Name, ExtrasPrice AS Price FROM Extras
UNION ALL
SELECT ItemName As Name, ItemPrice AS Price FROM Items) u
TOP 1 with an ORDER BY clause should work for you. Try this:
SELECT *
FROM (SELECT TOP 1 ExtrasPrice,ExtrasName
FROM Extras ORDER BY ExtrasPrice Asc) e,
(SELECT TOP 1 ItemPrice,ItemName
FROM Items ORDER BY ItemPrice Desc) i
Note: Comma can be replaced with CROSS JOIN
You can use row_number() for this. If you are satisfied with two rows:
SELECT item, price
FROM (SELECT foo.*, row_number() over (order by price) as seqnum_asc,
row_number() over (order by price desc) as seqnum_desc
FROM (SELECT item, ExtrasPrice as price FROM Extras
UNION ALL
SELECT item, ItemPrice FROM Items
) foo
) t
WHERE seqnum_asc = 1 or seqnum_desc = 1;
EDIT:
If you have an index on "price" in both tables, then the cheapest method is probably:
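-- each TOP 1 ... ORDER BY is wrapped in a derived table so it can be used in a UNION ALL branch
with exp as (
select e.*
from (select top 1 item, ExtrasPrice as price
from Extras
order by ExtrasPrice desc
) e
union all
select i.*
from (select top 1 item, ItemPrice as price
from Items
order by ItemPrice desc
) i
),
cheap as (
select e.*
from (select top 1 item, ExtrasPrice as price
from Extras
order by ExtrasPrice asc
) e
union all
select i.*
from (select top 1 item, ItemPrice as price
from Items
order by ItemPrice asc
) i
)
select e.*
from (select top 1 *
from exp
order by price desc
) e
union all
select c.*
from (select top 1 *
from cheap
order by price asc
) c;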
If you want this in one row, you can replace the final query with:
select e.*, c.*
from (select top 1 *
from exp
order by price desc
) e cross join
(select top 1 *
from cheap
order by price asc
) c
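The one-row variant can be checked with a small runnable sketch. It assumes SQLite (via Python's sqlite3), which uses LIMIT 1 where SQL Server uses TOP 1; the table and column names follow the question, while the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Extras (ExtrasName TEXT, ExtrasPrice REAL);
    CREATE TABLE Items  (ItemName TEXT, ItemPrice REAL);
    -- Hypothetical sample rows.
    INSERT INTO Extras VALUES ('Cheese', 1.5), ('Bacon', 2.5);
    INSERT INTO Items  VALUES ('Burger', 8.0), ('Fries', 3.0);
    """
)

# Combine both tables, pick the top and bottom row by price, and cross join
# the two single-row results into one output row.
query = """
WITH u AS (
    SELECT ExtrasName AS Name, ExtrasPrice AS Price FROM Extras
    UNION ALL
    SELECT ItemName, ItemPrice FROM Items
)
SELECT e.Name  AS MostExpensiveItem,
       e.Price AS MostExpensivePrice,
       c.Name  AS CheapestItem,
       c.Price AS CheapestPrice
FROM (SELECT Name, Price FROM u ORDER BY Price DESC LIMIT 1) AS e
CROSS JOIN (SELECT Name, Price FROM u ORDER BY Price ASC LIMIT 1) AS c
"""
extremes = conn.execute(query).fetchall()
print(extremes)  # [('Burger', 8.0, 'Cheese', 1.5)]
```

If several items tie for the highest or lowest price, LIMIT 1 (like TOP 1) returns an arbitrary one of them unless a tie-breaking column is added to the ORDER BY.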

T-Sql find duplicate row values

I want to write a stored procedure.
In that stored procedure, I want to find duplicate row values from a table, and calculate sum operation on these rows to the same table.
Let's say, I have a CustomerSales table;
ID  SalesRepresentative  Customer  Quantity
1   Michael              CustA     55
2   Michael              CustA     10
and I need to turn table to...
ID  SalesRepresentative  Customer  Quantity
1   Michael              CustA     65
2   Michael              CustA     0
When SalesRepresentative and Customer are both duplicated across rows, I want to sum the Quantity values of those rows, assign the sum to the first row, and set the others to 0.
Could you help me?
To aggregate duplicates into one row:
SELECT min(ID) AS ID, SalesRepresentative, Customer
,sum(Quantity) AS Quantity
FROM CustomerSales
GROUP BY SalesRepresentative, Customer
ORDER BY min(ID)
Or, if you actually want those extra rows with 0 as Quantity in the result:
SELECT ID, SalesRepresentative, Customer
,CASE
WHEN (count(*) OVER (PARTITION BY SalesRepresentative,Customer)) = 1
THEN Quantity
WHEN (row_number() OVER (PARTITION BY SalesRepresentative,Customer
ORDER BY ID)) = 1
THEN sum(Quantity) OVER (PARTITION BY SalesRepresentative,Customer)
ELSE 0
END AS Quantity
FROM CustomerSales
ORDER BY ID
This makes heavy use of window functions.
Alternative version without window functions:
SELECT min(ID) AS ID, SalesRepresentative, Customer, sum(Quantity) AS Quantity
FROM CustomerSales
GROUP BY SalesRepresentative, Customer
UNION ALL
SELECT c.ID, c.SalesRepresentative, c.Customer, 0 AS Quantity
FROM CustomerSales c
LEFT JOIN (
SELECT min(ID) AS ID
FROM CustomerSales
GROUP BY SalesRepresentative, Customer
) x ON (x.ID = c.ID)
WHERE x.ID IS NULL
ORDER BY ID
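The window-function version can be verified against the sample table from the question. This is a minimal sketch assuming SQLite via Python's sqlite3 rather than SQL Server; the third row (Sarah) is a hypothetical addition to show that non-duplicated rows keep their quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CustomerSales (
        ID INTEGER, SalesRepresentative TEXT, Customer TEXT, Quantity INTEGER
    );
    INSERT INTO CustomerSales VALUES
        (1, 'Michael', 'CustA', 55),
        (2, 'Michael', 'CustA', 10),
        (3, 'Sarah',   'CustB', 7);   -- hypothetical non-duplicate row
    """
)

# The first row of each (SalesRepresentative, Customer) group receives the
# group total; every later row in the group gets 0.
query = """
SELECT ID, SalesRepresentative, Customer,
       CASE
           WHEN ROW_NUMBER() OVER (PARTITION BY SalesRepresentative, Customer
                                   ORDER BY ID) = 1
           THEN SUM(Quantity) OVER (PARTITION BY SalesRepresentative, Customer)
           ELSE 0
       END AS Quantity
FROM CustomerSales
ORDER BY ID
"""
merged = conn.execute(query).fetchall()
print(merged)
# [(1, 'Michael', 'CustA', 65), (2, 'Michael', 'CustA', 0), (3, 'Sarah', 'CustB', 7)]
```

This only selects the adjusted values; writing them back to the table, as the question asks, would need a separate UPDATE step.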