ID appears twice in query when multiplied with different Prices - sql

I am trying to get the top 5 stations by Sales,
but I ran into the problem that one station appears twice if multiplied by a different price.
This is my query:
SELECT distinct b_id, count(t_start_id) * v_preis AS START_PRICE
FROM bahnhof
INNER JOIN tickets
ON t_start_id = b_id
INNER JOIN connections
ON t_connection_id = v_id
GROUP BY b_id, v_preis
ORDER BY START_PRICE DESC LIMIT 5;
Which gives me the following result:
b_id START_PRICE
7 75
6 50
4 30
1 16
1 15
What I need though is:
b_id START_PRICE
7 75
6 50
1 31
4 30
I tried to group by ID only, but it didn't work since v_preis had to be in there too.
The price for 1 is 8 twice and 15 once, so I guess I have a problem with using different rows for one result.
I'm pretty new to SQL, so I'm sorry if this is a dumb question,
thank you in advance!

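Did you try using SUM() aggregation and grouping by the ID only? Grouping by v_preis as well produces one row per station and price, whereas summing v_preis over all of a station's tickets gives one total per station (8 + 8 + 15 = 31 for station 1).
SELECT b_id, SUM(v_preis) AS start_price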
FROM bahnhof
JOIN tickets
ON t_start_id = b_id
JOIN connections
ON t_connection_id = v_id
GROUP BY b_id
ORDER BY START_PRICE DESC
LIMIT 5;
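To see the effect on a small scale, here is a minimal sketch using Python's sqlite3 module. The table and column names come from the question; the sample rows, the helper name top_stations, and the use of SQLite are assumptions made for illustration only.

```python
import sqlite3

def top_stations(limit=5):
    """Return (b_id, start_price) rows for hypothetical sample data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE bahnhof (b_id INTEGER PRIMARY KEY);
        CREATE TABLE connections (v_id INTEGER PRIMARY KEY, v_preis INTEGER);
        CREATE TABLE tickets (t_start_id INTEGER, t_connection_id INTEGER);
        INSERT INTO bahnhof VALUES (1), (4);
        INSERT INTO connections VALUES (10, 8), (11, 15), (12, 30);
        -- station 1: two tickets at 8 and one at 15; station 4: one at 30
        INSERT INTO tickets VALUES (1, 10), (1, 10), (1, 11), (4, 12);
    """)
    rows = con.execute("""
        SELECT b_id, SUM(v_preis) AS start_price
        FROM bahnhof
        JOIN tickets ON t_start_id = b_id
        JOIN connections ON t_connection_id = v_id
        GROUP BY b_id
        ORDER BY start_price DESC
        LIMIT ?
    """, (limit,)).fetchall()
    con.close()
    return rows

print(top_stations())
```

With these made-up rows, station 1 collapses into a single row instead of one row per price.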

Related

Join tables based on dates with check

I have two tables in PostgreSQL:
Demans_for_parts:
demandid partid demanddate quantity
40 125 01.01.17 10
41 125 05.01.17 30
42 123 20.06.17 10
Orders_for_parts:
orderid partid orderdate quantity
1 125 07.01.17 15
54 125 10.06.17 25
14 122 05.01.17 30
Basically, Demans_for_parts says what to buy and Orders_for_parts says what we bought. We can buy parts which are not listed in Demans_for_parts.
I need a report which shows me all parts in Demans_for_parts and how many weeks have passed since the most recent matching row in Orders_for_parts. Note that the quantity field is irrelevant here.
The expected result is (if there is more than one row per part, show the oldest):
partid demanddate weeks_since_recent_order
125 01.01.17 2 (last order is on 10.06.17)
123 20.06.17 Unhandled
I think the tricky part is getting one row per part from each table, but that is easy using distinct on. Then you need to calculate the months. You can use age() for this purpose:
select dp.partid, dp.demanddate,
(extract(year from age(dp.demanddate, op.orderdate))*12 +
extract(month from age(dp.demanddate, op.orderdate))
) as months
from (select distinct on (dp.partid) dp.*
from demans_for_parts dp
order by dp.partid, dp.demanddate desc
) dp left join
(select distinct on (op.partid) op.*
from Orders_for_parts op
order by op.partid, op.orderdate desc
) op
on dp.partid = op.partid;
Something like this?
with o as (
select distinct partid, max(orderdate) over (partition by partid)
from Orders_for_parts
)
, p as (
select distinct partid, min(demanddate) over (partition by partid)
from Demans_for_parts
)
select p.partid, min as demanddate, date_part('day',o.max - p.min)/7
from p
left outer join o on (p.partid = o.partid)
;
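A small runnable sketch of the same idea (oldest demand per part, left-joined to the most recent order per part), written against SQLite through Python's sqlite3 module rather than PostgreSQL. It is an illustration under assumptions: the dates from the question are rewritten in ISO format, whole weeks are computed with SQLite's julianday(), "weeks" is read as the gap between the oldest demand date and the most recent order date, and the function name is hypothetical.

```python
import sqlite3

def weeks_since_recent_order():
    """One row per demanded part: (partid, oldest demanddate, whole weeks or None)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE demans_for_parts
            (demandid INTEGER, partid INTEGER, demanddate TEXT, quantity INTEGER);
        CREATE TABLE orders_for_parts
            (orderid INTEGER, partid INTEGER, orderdate TEXT, quantity INTEGER);
        INSERT INTO demans_for_parts VALUES
            (40, 125, '2017-01-01', 10),
            (41, 125, '2017-01-05', 30),
            (42, 123, '2017-06-20', 10);
        INSERT INTO orders_for_parts VALUES
            (1, 125, '2017-01-07', 15),
            (54, 125, '2017-06-10', 25),
            (14, 122, '2017-01-05', 30);
    """)
    rows = con.execute("""
        WITH d AS (SELECT partid, MIN(demanddate) AS demanddate
                   FROM demans_for_parts GROUP BY partid),
             o AS (SELECT partid, MAX(orderdate) AS orderdate
                   FROM orders_for_parts GROUP BY partid)
        SELECT d.partid, d.demanddate,
               -- NULL when the part was never ordered
               CAST((julianday(o.orderdate) - julianday(d.demanddate)) / 7
                    AS INTEGER) AS weeks
        FROM d
        LEFT JOIN o ON o.partid = d.partid
        ORDER BY d.partid
    """).fetchall()
    con.close()
    return rows

print(weeks_since_recent_order())
```

The LEFT JOIN keeps part 123, which has no matching order, with a NULL week count; part 122 is dropped because it was never demanded.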

SQL counting query

Sorry if this is a basic question.
Basically, I have a table that is as follows, below is a basic sample
store-ProdCode-result
13p I10x 5
13p I20x 7
13p I30x 8
14a K38z 23
17a K38z 23
My data set has nearly 100,000 records.
What I'm trying to do is, for every store find the top 10 prodCode.
I am unsure of how to do this but what I tried was:
select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null
group by store,prod_code
order by count(prod_code) desc limit 10
This is giving me something completely different, and I'm unsure how to go about achieving my final result.
All help is appreciated.
Thanks
The expected output should be: for every store(s_code) display the top 10 prodcode
so:
store--prodcode--result
1a abc 5
1a abd 4
2a dgf 1
2a ldk 6
... (10 rows per store, then the next store code)
You can use the table twice in the FROM clause, once for the data, and once to get a count of how many records have fewer results for that store.
SELECT a.s_code, a.prod_code, count(*)
FROM top10_secondary a
LEFT OUTER JOIN top10_secondary b
ON a.s_code = b.s_code
AND b.result < a.result
GROUP BY a.s_code, a.prod_code
HAVING count(*) < 10
With this technique, though, you may get more than 10 records per store if the 10th result value occurs more than once, because the limit rule is simply "include a record as long as there are fewer than 10 records with result values lower than mine".
It looks like in your case, "result" is a ranking, so they would not be duplicated per store.
This is a good case for Window functions.
SELECT
s_code,
prod_code,
prod_count
FROM
(
SELECT
s_code,
prod_code,
prod_count,
RANK() OVER (PARTITION BY s_code ORDER BY prod_Count DESC) as prod_rank
FROM
(SELECT s_code, prod_code, count(prod_code) AS prod_count FROM top10_secondary GROUP BY s_code, prod_code) t1
) t2
WHERE prod_rank <= 10
The innermost query gets the count of each product at each store. The middle query ranks those products within each store by that count. The outermost query then limits the results based on that rank.
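The three layers can be tried out with a minimal sketch in Python's sqlite3 module. The assumptions: a SQLite build with window function support (3.25 or later), made-up rows, a hypothetical helper top_products, and a cutoff of 2 instead of 10 to keep the sample small.

```python
import sqlite3

def top_products(n=2):
    """Return the n most frequent products per store for hypothetical rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE top10_secondary (s_code TEXT, prod_code TEXT)")
    sample = (
        [("13p", "I10x")] * 3 + [("13p", "I20x")] * 2 + [("13p", "I30x")]
        + [("14a", "K38z")] * 2
    )
    con.executemany("INSERT INTO top10_secondary VALUES (?, ?)", sample)
    rows = con.execute("""
        SELECT s_code, prod_code, prod_count
        FROM (
            SELECT s_code, prod_code, prod_count,
                   RANK() OVER (PARTITION BY s_code
                                ORDER BY prod_count DESC) AS prod_rank
            FROM (SELECT s_code, prod_code, COUNT(prod_code) AS prod_count
                  FROM top10_secondary
                  GROUP BY s_code, prod_code) t1
        ) t2
        WHERE prod_rank <= ?
        ORDER BY s_code, prod_rank
    """, (n,)).fetchall()
    con.close()
    return rows

print(top_products())
```

Because RANK() gives tied counts the same rank, a store can return more than n rows when there is a tie at the cutoff; ROW_NUMBER() would cap it at exactly n.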

returning only one row for each value of a column along with other values in different columns

I am working on a query against a SQL table with several columns and many rows. The query should return one row for each unique combination of the first and second columns, based on the criteria given in the query.
For example, I have the following table CC
product term bid offer bidcp offercp
AA sep14 20 10 x y
AA Sep14 15 9 p q
BA Sep14 30 15 as ps
XY Sep14 25 15 r t
XY Oct14 30 20 t r
XY Oct14 25 22 p q
When I run the query on the above table it should return the following data
product term bid offer bidcp offercp
AA sep14 20 9 x q (offercp comes from the record with the lowest offer)
BA Sep14 30 15 as ps
XY Sep14 25 15 r t
XY Oct14 30 20 t r
When I executed the following query, it also grouped the data in CC by bidcp and offercp, so it returned almost every row, because the bidcp and offercp values are nearly all distinct. I just want bidcp and offercp to come from the rows that supply the bid and the offer, assuming each bid and offer pair is unique for each product and term.
select product,term,max(bid) as bid,min(offer) as offer,bidcp,offercp from canadiancrudes where product like '%/%' group by product,term,bidcp,offercp
But when I removed bidcp and offercp from the GROUP BY clause, it threw an obvious error:
Column 'CC.BidCP' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Is there a better way to fix it?
In that case, you need 2 CTEs:
WITH o AS (
SELECT product,term,offer,offercp, ROW_NUMBER() OVER (PARTITION BY product, term ORDER BY offer ASC) AS rn
FROM canadiancrudes where product like '%/%'
)
, b AS (
SELECT product,term,bid,bidcp, ROW_NUMBER() OVER (PARTITION BY product, term ORDER BY bid DESC) AS rn
FROM canadiancrudes where product like '%/%'
)
SELECT o.product,o.term,b.bid,o.offer,b.bidcp,o.offercp
FROM o
INNER JOIN b
ON o.product=b.product
AND o.term=b.term
WHERE o.rn=1
AND b.rn=1
Use a CTE to get the min, max value -
WITH MaxMin_CTE AS (
SELECT product,term,max(bid) as bid,min(offer) AS Offer
FROM CC
GROUP BY product,term)
SELECT * from CC
INNER JOIN MaxMin_CTE ON CC.product = MaxMin_CTE.product
AND CC.bid= MaxMin_CTE.bid AND CC.Offer = MaxMin_CTE.offer
AND CC.Term = MaxMin_CTE.Term
Here's the SQL Fiddle:
http://sqlfiddle.com/#!6/a6588/2
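As a rough check of the two-CTE ROW_NUMBER approach, here is a sketch on the question's sample rows using Python's sqlite3 module. The assumptions: SQLite 3.25 or later, a table simply named cc, the term spelled consistently as Sep14, the product LIKE '%/%' filter left out because the sample products contain no slash, and a hypothetical function name.

```python
import sqlite3

def best_bid_offer():
    """Per (product, term): highest bid and lowest offer with their counterparties."""
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE cc (product TEXT, term TEXT, bid INTEGER,
                                    offer INTEGER, bidcp TEXT, offercp TEXT)""")
    con.executemany("INSERT INTO cc VALUES (?, ?, ?, ?, ?, ?)", [
        ("AA", "Sep14", 20, 10, "x", "y"),
        ("AA", "Sep14", 15, 9, "p", "q"),
        ("BA", "Sep14", 30, 15, "as", "ps"),
        ("XY", "Sep14", 25, 15, "r", "t"),
        ("XY", "Oct14", 30, 20, "t", "r"),
        ("XY", "Oct14", 25, 22, "p", "q"),
    ])
    rows = con.execute("""
        WITH o AS (  -- rn = 1 marks the row with the lowest offer
            SELECT product, term, offer, offercp,
                   ROW_NUMBER() OVER (PARTITION BY product, term
                                      ORDER BY offer ASC) AS rn
            FROM cc),
        b AS (       -- rn = 1 marks the row with the highest bid
            SELECT product, term, bid, bidcp,
                   ROW_NUMBER() OVER (PARTITION BY product, term
                                      ORDER BY bid DESC) AS rn
            FROM cc)
        SELECT o.product, o.term, b.bid, o.offer, b.bidcp, o.offercp
        FROM o
        JOIN b ON b.product = o.product AND b.term = o.term
        WHERE o.rn = 1 AND b.rn = 1
        ORDER BY o.product, o.term
    """).fetchall()
    con.close()
    return rows

for row in best_bid_offer():
    print(row)
```

On this sample the best bid and the best offer for AA sit in different rows, so joining back on both MAX(bid) and MIN(offer) at once would return no row for AA; ranking bids and offers separately avoids that.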

counting subquery in SQL

I have the following query to count how many times each process_track_id occurs in a table:
SELECT
a.process_track_id,
COUNT(1) AS 'num'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
This returns the following results:
process_track_id | num
1 14
2 44
3 16
5 8
6 18
7 17
8 14
This is great. Now is the part where I am stuck. I would like to get the following table:
num count
8 1
14 2
16 1
17 1
18 1
44 1
Where num are the distinct counts from the first table, and count is how many times that frequency occurs.
Here is what I have tried (it's a subquery, but I'm not sold on the method), and I haven't been able to get it to work just yet. I'm new to SQL and I think I'm missing some key aspects of the syntax.
SELECT
X.id_count,
count(1) as 'num_count'
FROM
(SELECT
a.process_track_id,
COUNT(1) AS 'id_count'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
--COUNT(1) AS 'id_count'
) X;
Any ideas?
It's probably good to keep in mind that this may have to be run on a database with at least 1 million records, and I don't have the ability to create a new table in the process.
Thanks!
Here's the subquery method you were driving at:
SELECT id_count, COUNT(*) AS 'num_count'
FROM (SELECT a.process_track_id
,COUNT(*) AS 'id_count'
FROM transreport.process_name a
GROUP BY a.process_track_id
)sub
GROUP BY id_count
Not sure there's a better method as the aggregation needs to run once anyway.
Try this
SELECT x.num, COUNT(*) AS COUNT
FROM (
SELECT
a.process_track_id, -- <--- You may remove this column
COUNT(*) AS 'num'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
) X
GROUP BY X.num
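Both answers boil down to grouping twice: once by ID, then by the resulting count. A minimal sketch with Python's sqlite3 module follows; the rows are made up, the table is named process_name without the transreport schema prefix, and the function name is hypothetical.

```python
import sqlite3

def count_of_counts():
    """Return (num, how many IDs occur num times) for hypothetical rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE process_name (process_track_id INTEGER)")
    # IDs 1 and 3 occur twice, ID 2 occurs three times
    ids = [1, 1, 2, 2, 2, 3, 3]
    con.executemany("INSERT INTO process_name VALUES (?)", [(i,) for i in ids])
    rows = con.execute("""
        SELECT num, COUNT(*) AS num_count
        FROM (SELECT process_track_id, COUNT(*) AS num
              FROM process_name
              GROUP BY process_track_id) sub
        GROUP BY num
        ORDER BY num
    """).fetchall()
    con.close()
    return rows

print(count_of_counts())
```

The inner query runs the per-ID aggregation once, and the outer query only touches its (much smaller) result, so no extra table is needed.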

greatest N per group with padding

I've been trying to solve this problem over the weekend, without luck so far. I have two tables:
TopOffers:
OfferId RetailerId Order
1 38 0
2 8 3
3 17 2
4 22 1
And Offers:
Id RetailerId Name Description etc...
1 3 Strawberry Red and smelly
2 38 Cookie Crunchy
3 17 Onion Of the nice kind
4 22 Apple Cheap
5 8 Toothbrush Lasts extra long!
My goal is to get the top 10 Offers for each Retailer ID. The order in which they should be listed is specified by the Order field in the TopOffer table (sort order is ascending). On top of that, the result should be padded to 10 offers when there are fewer than 10 TopOffer records for a retailer. The TopOffer table always contains 10 or fewer records per retailer.
So far I've managed to get this going, which works (I realize it doesn't get the top 10, but rather everything that's in the TopOffer table, which is alright, since the TopOffer table is always equal to or smaller than the top 10 for any retailer):
SELECT b.*
FROM
(
SELECT o.Id, to.`Order` FROM Offer AS o
LEFT JOIN TopOffer AS to
ON o.Id = to.OfferId
) AS a,
(
SELECT o.*, to.`Order` FROM Offer AS o
LEFT JOIN TopOffer AS to
ON o.Id = to.OfferId
) AS b
WHERE a.`Order` >= b.`Order` AND a.Id = b.Id
GROUP BY b.RetailerId, b.Id
HAVING Count(1) BETWEEN 1 AND 10
ORDER BY RetailerId, `Order` ASC
Unfortunately I can't seem to find any way of padding the result of this query with offers that don't have an entry in the TopOffer table if there aren't 10 TopOffer records for that retailer.
My sincerest thanks in advance for any help!
If you create a virtual table with the numbers 1-10, you can left join it to your results to get 10 of each:
select number, results.*
from
(select 1 as number union select 2 union select 3 ... union select 10) numbers
left join
(your query here) results
on numbers.number = results.rank
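A different way to get the padding, shown here only as a sketch and not as the numbers-table technique above: rank every offer of a retailer with ROW_NUMBER(), putting offers that have a TopOffer entry first (by Order) and the remaining offers after them, then keep the first N. The assumptions: window function support (SQLite 3.25 or later here; MySQL would need version 8), made-up rows, a cutoff of 3 instead of 10, the alias t instead of to, and a hypothetical function name.

```python
import sqlite3

def padded_top_offers(n=3):
    """Top offers per retailer, padded with that retailer's other offers up to n."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Offer (Id INTEGER PRIMARY KEY, RetailerId INTEGER, Name TEXT);
        CREATE TABLE TopOffer (OfferId INTEGER, RetailerId INTEGER, "Order" INTEGER);
        INSERT INTO Offer VALUES
            (1, 8, 'Apple'), (2, 8, 'Cookie'), (3, 8, 'Onion'), (4, 8, 'Pear'),
            (5, 17, 'Soap');
        -- retailer 8 has two top offers; retailer 17 has none
        INSERT INTO TopOffer VALUES (3, 8, 0), (1, 8, 1);
    """)
    rows = con.execute("""
        SELECT RetailerId, Id, Name
        FROM (
            SELECT o.RetailerId, o.Id, o.Name,
                   ROW_NUMBER() OVER (
                       PARTITION BY o.RetailerId
                       -- top offers first (IS NULL is 0 for them), then padding by Id
                       ORDER BY t.OfferId IS NULL, t."Order", o.Id
                   ) AS rn
            FROM Offer AS o
            LEFT JOIN TopOffer AS t ON t.OfferId = o.Id
        ) AS ranked
        WHERE rn <= ?
        ORDER BY RetailerId, rn
    """, (n,)).fetchall()
    con.close()
    return rows

for row in padded_top_offers():
    print(row)
```

Which offers are used as padding is a design choice; this sketch simply takes the lowest Ids that are not already top offers.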