Select pairs of values based on condition in other column - PostgreSQL - sql

I've been trying to solve an issue for the past couple of days, but couldn't figure out what the solution would be...
I have a table as the following:
+--------+-----------+-------+
| ShopID | ArticleID | Price |
+--------+-----------+-------+
| 1 | 3 | 150 |
| 1 | 2 | 80 |
| 3 | 3 | 100 |
| 4 | 2 | 95 |
+--------+-----------+-------+
And I would like to select pairs of shop IDs where the first shop charges a higher price than the second for the same article.
For example, the result should look like this:
+----------+----------+---------+
| ShopID_1 | ShopID_2 |ArticleID|
+----------+----------+---------+
| 4 | 1 | 2 |
| 1 | 3 | 3 |
+----------+----------+---------+
... showing that Article 2 is more expensive in ShopID 4 than in ShopID 1, and so on.
My code so far looks like this:
SELECT ShopID AS ShopID_1, ShopID AS ShopID_2, ArticleID FROM table
WHERE table.ArticleID=table.ArticleID and table.Price > table.Price
But it doesn't give the result I am searching for.
Can anyone help me with this objective? Thank you very much.

This is a top-N-per-group problem.
Assume you have the following data in a table named sales:
# select * from sales;
shopid | articleid | price
--------+-----------+-------
1 | 2 | 80
3 | 3 | 100
4 | 2 | 95
1 | 3 | 150
5 | 3 | 50
With the following query we can create a partition for each ArticleId
select
ArticleID,
ShopID,
Price,
row_number() over (partition by ArticleID order by Price desc) as Price_Rank
from sales;
This results in:
articleid | shopid | price | price_rank
-----------+--------+-------+------------
2 | 4 | 95 | 1
2 | 1 | 80 | 2
3 | 1 | 150 | 1
3 | 3 | 100 | 2
3 | 5 | 50 | 3
Then we simply select the top 2 items for each ArticleID:
select
ArticleID,
ShopID,
Price
from (
select
ArticleID,
ShopID,
Price,
row_number() over (partition by ArticleID order by Price desc) as Price_Rank
from sales) sales_rank
where Price_Rank <= 2;
which will result:
articleid | shopid | price
-----------+--------+-------
2 | 4 | 95
2 | 1 | 80
3 | 1 | 150
3 | 3 | 100
Finally, we can use the crosstab function to get the expected pivot view.
select *
from crosstab(
'select
ArticleID,
ShopID,
ShopID
from (
select
ArticleID,
ShopID,
Price,
row_number() over (partition by ArticleID order by Price desc) as Price_Rank
from sales) sales_rank
where Price_Rank <= 2
order by ArticleID, Price_Rank')
AS sales_top_2("ArticleID" INT, "ShopID_1" INT, "ShopID_2" INT);
And the result:
ArticleID | ShopID_1 | ShopID_2
-----------+----------+----------
2 | 4 | 1
3 | 1 | 3
Note:
You may need to run CREATE EXTENSION tablefunc; if you get the error function crosstab(unknown) does not exist.
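If installing tablefunc is not an option, the same pivot can be sketched with conditional aggregation over the ranked rows. The snippet below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (assuming SQLite 3.25 or newer for window functions), and the variable name top_two is made up for the example.

```python
import sqlite3

# In-memory SQLite is an assumption standing in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (shopid INTEGER, articleid INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 2, 80), (3, 3, 100), (4, 2, 95), (1, 3, 150), (5, 3, 50)],
)

# Rank shops per article by price, keep the top 2, then pivot the two
# ranks into columns with MAX(CASE ...) instead of crosstab.
top_two = conn.execute(
    """
    SELECT articleid,
           MAX(CASE WHEN price_rank = 1 THEN shopid END) AS shopid_1,
           MAX(CASE WHEN price_rank = 2 THEN shopid END) AS shopid_2
    FROM (
        SELECT articleid, shopid,
               row_number() OVER (PARTITION BY articleid ORDER BY price DESC) AS price_rank
        FROM sales
    ) sales_rank
    WHERE price_rank <= 2
    GROUP BY articleid
    ORDER BY articleid
    """
).fetchall()

for row in top_two:
    print(row)
```

The SQL inside the string should also run on PostgreSQL unchanged, since it only uses row_number(), CASE and MAX.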

This query should work:
SELECT t1.ShopID AS ShopID_1, t2.ShopID AS ShopID_2, t1.ArticleID
FROM <yourtable> t1 JOIN
<yourtable> t2
ON t1.ArticleID = t2.ArticleID AND t1.Price > t2.Price;
That is, you need a self-join and appropriate table aliases.
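To try the self-join against the sample data from the question, here is a small sketch. It runs on Python's built-in sqlite3 module rather than PostgreSQL, and the table name sales is an assumption (the question does not name the table).

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; "sales" is an assumed table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ShopID INTEGER, ArticleID INTEGER, Price INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 3, 150), (1, 2, 80), (3, 3, 100), (4, 2, 95)],
)

# Self-join: pair each row with the rows for the same article
# that have a strictly lower price.
pairs = conn.execute(
    """
    SELECT t1.ShopID AS ShopID_1, t2.ShopID AS ShopID_2, t1.ArticleID
    FROM sales t1
    JOIN sales t2
      ON t1.ArticleID = t2.ArticleID AND t1.Price > t2.Price
    ORDER BY t1.ArticleID
    """
).fetchall()

for row in pairs:
    print(row)
```

With more than two shops per article, this returns every (more expensive, cheaper) pair, not just one row per article.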

Related

count total items, sold items (in another table reference by id) and grouped by serial number

I have a table of items in the shop. An item may have several entries with the same serial number (sn) but different ids if the same item was bought again later at a different price (price here is how much a single item cost the shop).
id | sn | amount | price
----+------+--------+-------
1 | AP01 | 100 | 7
2 | AP01 | 50 | 8
3 | X2P0 | 200 | 12
4 | X2P0 | 30 | 18
5 | STT0 | 20 | 20
6 | PLX1 | 200 | 10
and a table of transactions
id | item_id | price
----+---------+-------
1 | 1 | 10
2 | 1 | 9
3 | 1 | 10
4 | 2 | 11
5 | 3 | 15
6 | 3 | 15
7 | 3 | 15
8 | 4 | 18
9 | 5 | 22
10 | 5 | 22
11 | 5 | 22
12 | 5 | 22
where transactions.item_id references items(id).
I want to group items by serial number (sn), get their sum(amount) and avg(price), and join that with a sold column that counts the number of transactions referencing each item id.
I did the first with
select i.sn, sum(i.amount), avg(i.price) from items i group by i.sn;
sn | sum | avg
------+-----+---------------------
STT0 | 20 | 20.0000000000000000
PLX1 | 200 | 10.0000000000000000
AP01 | 150 | 7.5000000000000000
X2P0 | 230 | 15.0000000000000000
Then when I tried to join it with transactions I got strange results
select i.sn, sum(i.amount), avg(i.price) avg_cost, count(t.item_id) sold, sum(t.price) profit from items i left join transactions t on (i.id=t.item_id) group by i.sn;
sn | sum | avg_cost | sold | profit
------+-----+---------------------+------+--------
STT0 | 80 | 20.0000000000000000 | 4 | 88
PLX1 | 200 | 10.0000000000000000 | 0 | (null)
AP01 | 350 | 7.2500000000000000 | 4 | 40
X2P0 | 630 | 13.5000000000000000 | 4 | 63
As you can see, only the sold and profit columns show correct results; sum and avg differ from the expected values.
I can't separate the statements because I am not sure how to attach the count to the sn group, given that the transactions reference the item id:
select
j.sn,
j.sum,
j.avg,
count(item_id)
from (
select
i.sn,
sum(i.amount),
avg(i.price)
from items i
group by i.sn
) j
left join transactions t
on (j.id???=t.item_id);
There are multiple matches in both tables, so the join multiplies the rows (and eventually produces wrong results). I would recommend counting the transactions per item first, then aggregating by sn:
select
sn,
sum(amount) total_amount,
avg(price) avg_price,
sum(no_transactions) no_transactions
from (
select
i.*,
(
select count(*)
from transactions t
where t.item_id = i.id
) no_transactions
from items i
) t
group by sn
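As a quick check of this approach against the sample data, here is a sketch using Python's built-in sqlite3 module as a stand-in database. The derived-table alias counted is chosen for the example only.

```python
import sqlite3

# In-memory SQLite is an assumption; the SQL itself is plain standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, sn TEXT, amount INTEGER, price INTEGER)")
conn.execute("CREATE TABLE transactions (id INTEGER, item_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [(1, "AP01", 100, 7), (2, "AP01", 50, 8), (3, "X2P0", 200, 12),
     (4, "X2P0", 30, 18), (5, "STT0", 20, 20), (6, "PLX1", 200, 10)],
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 9), (3, 1, 10), (4, 2, 11), (5, 3, 15), (6, 3, 15),
     (7, 3, 15), (8, 4, 18), (9, 5, 22), (10, 5, 22), (11, 5, 22), (12, 5, 22)],
)

# Count transactions per item in a correlated subquery, so each item row
# appears exactly once before it is aggregated by serial number.
rows = conn.execute(
    """
    SELECT sn,
           SUM(amount) AS total_amount,
           AVG(price) AS avg_price,
           SUM(no_transactions) AS no_transactions
    FROM (
        SELECT i.*,
               (SELECT COUNT(*) FROM transactions t WHERE t.item_id = i.id) AS no_transactions
        FROM items i
    ) counted
    GROUP BY sn
    ORDER BY sn
    """
).fetchall()

for row in rows:
    print(row)
```

The sums and averages now match the first query in the question, because no item row is duplicated by the join.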

sum last values and group by

I have "steps" table like this
id | points | game_id | price | user_id | timestamp | some | additional | fields
it contains game information.
I have a code which can group by game_id
SELECT game_id, MIN(timestamp),
(SELECT points FROM steps as t2 WHERE t2.game_id = t1.game_id ORDER BY t2.id DESC LIMIT 1) as last_point
FROM steps as t1
WHERE user_id = 1
GROUP BY game_id
but I want to group by price and sum the last points of each game. My query is
SELECT COUNT(DISTINCT game_id) as game_count, COUNT(id) as step_count, SUM(points), price
FROM steps WHERE user_id = 1
GROUP BY price
But this query returns the sum of all points, while I need the sum of the last points of each game.
Please point me in the right direction.
Example result
last_points_sum | game_count | step_count | price
200 | 2 | 3 | 100
400 | 3 | 4 | 200
where table is
id | points | game_id | price | user_id | timestamp
1 | 10 | 5 | 100 | 1 | 100000001
2 | 200 | 5 | 100 | 1 | 100000002
3 | 200 | 6 | 200 | 1 | 100000003
4 | 0 | 6 | 200 | 1 | 100000004
5 | 400 | 6 | 200 | 1 | 100000005
Is this what you're looking for?
This assumes that timestamp is unique, at least for each instance of game_id.
SELECT
COUNT(DISTINCT game_id) AS game_count,
COUNT(id) AS step_count,
SUM(COALESCE(ltIsLastPoints, 0.0) * points),
price
FROM
(SELECT
game_id ltGameID,
MAX(timestamp) ltTimestamp,
1.0 ltIsLastPoints
FROM
steps
GROUP BY
game_id
) lt RIGHT JOIN
steps
ON ltGameID = game_id
AND ltTimestamp = timestamp
WHERE
user_id = 1
GROUP BY
price;
Your description says you want to group by points but your example query groups by price. I went with price.
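The same idea (flag the row with the latest timestamp per game, then sum only the flagged points) can be tried on the sample table with the sketch below. It is a variant, not the exact query above: it uses a LEFT JOIN from steps and a CASE expression, because older SQLite versions lack RIGHT JOIN, and it runs on Python's built-in sqlite3 module as an assumed stand-in database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE steps (id INTEGER, points INTEGER, game_id INTEGER, "
    "price INTEGER, user_id INTEGER, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO steps VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 10, 5, 100, 1, 100000001), (2, 200, 5, 100, 1, 100000002),
     (3, 200, 6, 200, 1, 100000003), (4, 0, 6, 200, 1, 100000004),
     (5, 400, 6, 200, 1, 100000005)],
)

# lt holds the latest timestamp per game; only the step matching it
# contributes its points to last_points_sum.
rows = conn.execute(
    """
    SELECT SUM(CASE WHEN lt.last_ts IS NOT NULL THEN s.points ELSE 0 END) AS last_points_sum,
           COUNT(DISTINCT s.game_id) AS game_count,
           COUNT(s.id) AS step_count,
           s.price
    FROM steps s
    LEFT JOIN (
        SELECT game_id, MAX(timestamp) AS last_ts
        FROM steps
        WHERE user_id = 1
        GROUP BY game_id
    ) lt ON lt.game_id = s.game_id AND lt.last_ts = s.timestamp
    WHERE s.user_id = 1
    GROUP BY s.price
    ORDER BY s.price
    """
).fetchall()

for row in rows:
    print(row)
```

Like the query above, this relies on timestamp being unique within each game.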

SQL value of previous (unknown) date

I have a table with articles, dates, and the amount bought. I want a result table that shows, for each article, the amount bought on each date together with the date and amount of the previous purchase, which is not known in advance:
Example
result of:
select articleid, amount, date from table1 where articleid in(7,8)
|------------|---------|----------|
| articleid | amount | date |
|------------|---------|----------|
| 7 | 34 |20.10.2019|
|------------|---------|----------|
| 7 | 2 |15.10.2019|
|------------|---------|----------|
| 8 | 12 |13.10.2019|
|------------|---------|----------|
| 8 | 35 |15.09.2019|
|------------|---------|----------|
The result should look like:
|------------|---------|----------|----------|----------|
| articleid | amount | date |prev date |prevamount|
|------------|---------|----------|----------|----------|
| 7 | 34 |20.10.2019|15.10.2019| 2 |
|------------|---------|----------|----------|----------|
| 7 | 2 |15.10.2019| | |
|------------|---------|----------|----------|----------|
| 8 | 12 |13.10.2019|15.09.2019| 35 |
|------------|---------|----------|----------|----------|
| 8 | 35 |15.09.2019| | |
|------------|---------|----------|----------|----------|
Is this possible in any way?
Best
Zio
You want lag():
select
articleid,
amount,
date,
lag(date) over(partition by articleid order by date) prevdate,
lag(amount) over(partition by articleid order by date) prevamount
from table1
order by articleid, date desc
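Here is a small sketch of lag() on the sample rows, run through Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions). The dates are stored as ISO strings, which is an assumption made so that ordering by date sorts chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (articleid INTEGER, amount INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(7, 34, "2019-10-20"), (7, 2, "2019-10-15"),
     (8, 12, "2019-10-13"), (8, 35, "2019-09-15")],
)

# lag() looks one row back within each article, in date order;
# the earliest row per article has no predecessor and gets NULL.
rows = conn.execute(
    """
    SELECT articleid, amount, date,
           lag(date)   OVER (PARTITION BY articleid ORDER BY date) AS prevdate,
           lag(amount) OVER (PARTITION BY articleid ORDER BY date) AS prevamount
    FROM table1
    ORDER BY articleid, date DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The NULLs come back as None in Python and correspond to the empty cells in the expected result.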

Top Dense_Rank row based on other fields

I have several tables joined together in SQL, and I am trying to display only the row with the MAX value of a column computed with DENSE_RANK, but I need to take 2 other fields into account when picking the top row.
Here is a sample of my result:
| sa_id | price | threshold | role_id | rk
1 | 37E41 | 40.00 | NULL | A38D67A | 1
2 | 37E41 | 40.00 | NULL | 46B9D4E | 1
3 | 1CFC1 | 40.00 | NULL | 58C1E03 | 1
4 | BF0D3 | 40.00 | NULL | 28D465B | 1
5 | F914B | 40.00 | NULL | 2920EBD | 1
6 | F3CA1 | 40.00 | NULL | D5E7584 | 1
7 | 0D8C1 | 40.00 | NULL | EECDB5A | 1
8 | A6503 | 40.00 | NULL | B680CB4 | 1
9 | 9BB96 | 40.00 | 0.01 | D66E612 | 1
10 | 9BB96 | 40.00 | 20.03 | D66E612 | 2
11 | 9BB96 | 40.00 | 40.03 | D66E612 | 3
12 | 9BB96 | 40.00 | 60.03 | D66E612 | 4
13 | 9BB96 | 40.00 | 80.03 | D66E612 | 5
What I am hoping to accomplish is to display all columns in this screenshot using the highest value for rk (calculated using DENSE RANK) where price > threshold and the sa_id & role_id are unique.
In this case I would want to display the following rows only: 1, 2, 3, 4, 5, 6, 7, 8, 10
Is this possible?
SELECT
servicerate_audit_id as sa_id
,ticket_price as price
,threshold_threshold/100.00 as threshold
,charge_role.chargerole_id as role_id
,DENSE_RANK() OVER(
PARTITION BY threshold_audit_id
ORDER BY
ISNULL(threshold_threshold,9999999),
threshold_threshold
) as rk
FROM sts_service_charge_rate
INNER JOIN ts_threshold
ON threshold_id = servicerate_threshold_id
INNER JOIN ts_charge_role as charge_role
ON chargerole_id = servicerate_charge_role_id
If you can modify your original query:
SELECT *
FROM (
SELECT
servicerate_audit_id as sa_id
,ticket_price as price
,threshold_threshold/100.00 as threshold
,charge_role.chargerole_id as role_id
,DENSE_RANK() OVER(
PARTITION BY threshold_audit_id
ORDER BY
ISNULL(threshold_threshold,9999999),
threshold_threshold
) as rk
,DENSE_RANK() OVER(
ORDER BY
ISNULL(threshold_threshold,9999999) DESC,
threshold_threshold DESC
) as rk_inverse
FROM sts_service_charge_rate
INNER JOIN ts_threshold
ON threshold_id = servicerate_threshold_id
INNER JOIN ts_charge_role as charge_role
ON chargerole_id = servicerate_charge_role_id
) t
WHERE price > COALESCE(threshold, 0)
AND t.rk_inverse = 1
Observe I just added an inverse calculation of your ranking and filtered for the top rk_inverse per partition. I'm assuming that the PARTITION BY threshold_audit_id and your requirement of having unique (sa_id, role_id) tuples are functionally dependent. Otherwise, your rk_inverse calculation would need to take into consideration a different PARTITION BY clause.
If you cannot modify your original query:
You can calculate another window function that orders your rk values in descending order (highest first) within each (sa_id, role_id) partition, and then take only the top row per partition:
SELECT sa_id, price, threshold, role_id, rk
FROM (
SELECT result.*, row_number() OVER (PARTITION BY sa_id, role_id ORDER BY rk DESC) rn
FROM (... original query ...) result
WHERE price > COALESCE(threshold, 0)
) t
WHERE rn = 1
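To see the row_number() filter in action, the sketch below loads a few of the sample rows into a table and applies the same pattern. It uses Python's built-in sqlite3 module instead of SQL Server, and the table query_result is a hypothetical stand-in for the output of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# query_result is a made-up table holding what the original query would return.
conn.execute(
    "CREATE TABLE query_result (sa_id TEXT, price REAL, threshold REAL, role_id TEXT, rk INTEGER)"
)
conn.executemany(
    "INSERT INTO query_result VALUES (?, ?, ?, ?, ?)",
    [("37E41", 40.0, None, "A38D67A", 1), ("37E41", 40.0, None, "46B9D4E", 1),
     ("9BB96", 40.0, 0.01, "D66E612", 1), ("9BB96", 40.0, 20.03, "D66E612", 2),
     ("9BB96", 40.0, 40.03, "D66E612", 3), ("9BB96", 40.0, 60.03, "D66E612", 4),
     ("9BB96", 40.0, 80.03, "D66E612", 5)],
)

# Drop rows whose threshold is not below the price first, then keep the
# highest remaining rk per (sa_id, role_id).
rows = conn.execute(
    """
    SELECT sa_id, price, threshold, role_id, rk
    FROM (
        SELECT r.*,
               row_number() OVER (PARTITION BY sa_id, role_id ORDER BY rk DESC) AS rn
        FROM query_result r
        WHERE price > COALESCE(threshold, 0)
    ) t
    WHERE rn = 1
    ORDER BY sa_id, role_id
    """
).fetchall()

for row in rows:
    print(row)
```

For sa_id 9BB96 only the row with rk 2 (threshold 20.03) survives, which matches row 10 in the question.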

Create a sub query for sum data as a new column in SQL Server

Suppose that I have a table name as tblTemp which has data as below:
| ID | AMOUNT |
----------------
| 1 | 10 |
| 1-1 | 20 |
| 1-2 | 30 |
| 1-3 | 40 |
| 2 | 50 |
| 3 | 60 |
| 4 | 70 |
| 4-1 | 80 |
| 5 | 90 |
| 6 | 100 |
An ID has the format X (without a dash) if it stands alone, or X-Y if the ID (Y) is a child of (X).
I want to add a new column (Total Amount) to output as below:
| ID | AMOUNT | Total Amount |
---------------------------------
| 1 | 10 | 100 |
| 1-1 | 20 | 100 |
| 1-2 | 30 | 100 |
| 1-3 | 40 | 100 |
| 2 | 50 | 50 |
| 3 | 60 | 60 |
| 4 | 70 | 150 |
| 4-1 | 80 | 150 |
| 5 | 90 | 90 |
| 6 | 100 | 100 |
The "Total Amount" column is a calculated column that sums the Amount values of all rows sharing the same (X) in the ID column.
In order to get parent ID (X), I use the following SQL:
SELECT
ID, SUBSTRING (ID, 1,
IIF (CHARINDEX('-', ID) = 0,
len(ID),
CHARINDEX('-', ID) - 1)
), Amount
FROM
tblTemp
How can I write a query like this in SQL Server 2012?
You can use sqlfiddle here to test it.
Thank You
Pengan
You have already done most of the work. To get the final result you can use your existing query and make it a subquery or use a CTE, then use sum() over() to get the result:
;with cte as
(
SELECT
ID,
SUBSTRING (ID, 1,
IIF (CHARINDEX('-', ID) = 0,
len(ID),
CHARINDEX('-', ID) - 1)
) id_val, Amount
FROM tblTemp
)
select id, amount, sum(amount) over(partition by id_val) total
from cte
See SQL Fiddle with Demo
You can do this using the sum() window function:
select id, amount,
SUM(amount) over (partition by (case when id like '%-%'
then left (id, charindex('-', id) - 1)
else id
end)
) as TotalAmount
from tblTemp t
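The windowed sum can be tried on the sample data with the sketch below. It runs on Python's built-in sqlite3 module rather than SQL Server, so substr() and instr() are swapped in for LEFT() and CHARINDEX(); that substitution is an assumption of the example, not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTemp (ID TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO tblTemp VALUES (?, ?)",
    [("1", 10), ("1-1", 20), ("1-2", 30), ("1-3", 40), ("2", 50),
     ("3", 60), ("4", 70), ("4-1", 80), ("5", 90), ("6", 100)],
)

# Partition by the parent part of the ID (text before the dash, or the
# whole ID when there is no dash) and sum Amount within each partition.
rows = conn.execute(
    """
    SELECT ID, Amount,
           SUM(Amount) OVER (
               PARTITION BY CASE WHEN ID LIKE '%-%'
                                 THEN substr(ID, 1, instr(ID, '-') - 1)
                                 ELSE ID
                            END
           ) AS TotalAmount
    FROM tblTemp
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```

Every row keeps its own Amount and gains the total of its parent group, so no GROUP BY or self-join is needed.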