order by count(catid) without group - sql

I want to count how many rows share the same catid and order the query by that total.
id | catid | name
0 | 1 | foo
1 | 1 | bar
2 | 2 | paint
I've tried COUNT(catid) but this requires a GROUP BY, and I do not want to compress rows.
How may I do this?

Do you want window functions?
select t.*, count(*) over (partition by catid) as cat_cnt
from t
order by cat_cnt, catid;
I should note that if you don't want to see the total, you can put the window function in the order by:
select *
from t
order by count(*) over (partition by catid), catid
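For anyone who wants to try the window-function approach without a database server, here is a minimal sketch that runs it against the sample rows from the question. It assumes Python's built-in sqlite3 module and an SQLite build with window-function support (3.25 or newer); the extra id in the ORDER BY is only there to make the output deterministic.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, catid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(0, 1, "foo"), (1, 1, "bar"), (2, 2, "paint")],
)

# COUNT(*) OVER (PARTITION BY ...) attaches the group size to every row
# without collapsing rows the way GROUP BY would.
rows = conn.execute(
    """
    SELECT t.*, COUNT(*) OVER (PARTITION BY catid) AS cat_cnt
    FROM t
    ORDER BY cat_cnt, catid, id
    """
).fetchall()

for row in rows:
    print(row)
```

Every original row is kept, and each one carries the number of rows that share its catid.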

Maybe you could run the GROUP BY as a separate SELECT, then JOIN?
E.g.
select t.*, summ.totals
from t
join (select catid, count(catid) as totals
from t
group by catid) summ
on t.catid = summ.catid
order by summ.totals, t.catid;

Related

Select arbitrary row for each group in Postgres

In Presto, there's an arbitrary() aggregate function to select any arbitrary row in a given group. If there's no group by clause, then I can use distinct on. With group by, every selected column must be in the group by or be an aggregated column. E.g.:
| id | foo |
| 1 | 123 |
| 1 | 321 |
select id, arbitrary(foo), count(*)
from mytable
group by id
It doesn't matter if it returns 1, 123, 2 or 1, 321, 2. Something like min() or max() works, but it's a lot slower.
Does something like arbitrary() exist in Postgres?
select m.foo,b.id,b.cnt from mytable m
join (select id, count(*) cnt
from mytable
group by id) b using (id) limit 1;
Without an explicit ORDER BY, row order is not guaranteed, so which foo appears in the above query is arbitrary. Note that limit 1 returns a single row overall, not one row per id.
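If one arbitrary row per id is wanted together with the group count, a rough sketch is to number the rows inside each group without any ordering and keep row number 1. This is an illustration rather than a Postgres-specific feature: it runs on SQLite through Python's sqlite3 module (SQLite 3.25+ assumed), and the row with id 2 is a made-up addition so that there is more than one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, foo INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 123), (1, 321), (2, 555)],  # (2, 555) is hypothetical sample data
)

# ROW_NUMBER() without ORDER BY numbers the rows of each id in an
# unspecified order, so keeping rn = 1 picks an arbitrary row per group.
rows = conn.execute(
    """
    SELECT id, foo, cnt
    FROM (
        SELECT id,
               foo,
               COUNT(*) OVER (PARTITION BY id) AS cnt,
               ROW_NUMBER() OVER (PARTITION BY id) AS rn
        FROM mytable
    ) AS numbered
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

For id 1 the result may carry either 123 or 321, which is exactly the "don't care" behaviour asked for.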

Is there a way to calculate average based on distinct rows without using a subquery?

If I have data like so:
+----+-------+
| id | value |
+----+-------+
| 1 | 10 |
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
| 2 | 20 |
+----+-------+
How do I calculate the average based on the distinct id WITHOUT using a subquery (i.e. querying the table directly)?
For the above example it would be (10+20+30)/3 = 20
I tried to do the following:
SELECT AVG(IF(id = LAG(id) OVER (ORDER BY id), NULL, value)) AS avg
FROM table
Basically I was thinking that if I order by id and check the previous row to see if it has the same id, the value should be NULL and thus it would not be counted into the calculation, but unfortunately I can't put analytical functions inside aggregate functions.
As far as I know, you can't do this without a subquery. I would use:
SELECT AVG(avg_value)
FROM
(
SELECT AVG(value) AS avg_value
FROM yourTable
GROUP BY id
) t;
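-- QUALIFY is not standard SQL; it is available in engines such as Teradata and Snowflake
WITH ranked AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS rn
FROM yourTable
QUALIFY rn = 1
)
SELECT
AVG(value)
FROM ranked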
The outer query will have other parameters that need to access all the data in the table
I interpret this comment as wanting an average on every row -- rather than doing an aggregation. If so, you can use window functions:
select t.*,
avg(case when seqnum = 1 then value end) over () as overall_avg
from (select t.*,
row_number() over (partition by id order by id) as seqnum
from t
) t;
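Yes, there is a way: simply use distinct inside the avg function, as below.
Note that this averages the distinct values, so it matches the per-id average only when no two ids share the same value.
select avg(distinct value) from tab;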
http://sqlfiddle.com/#!4/9d156/2/0
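To see when avg(distinct value) and the per-id average agree, here is a small sketch using Python's sqlite3 module. The table name tab follows the answer above; the extra row (4, 10) is a hypothetical addition that gives two different ids the same value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, 10), (1, 10), (2, 20), (3, 30), (2, 20)],  # rows from the question
)

PER_ID_AVG = "SELECT AVG(v) FROM (SELECT AVG(value) AS v FROM tab GROUP BY id)"
DISTINCT_AVG = "SELECT AVG(DISTINCT value) FROM tab"

# On the question's data both approaches give (10 + 20 + 30) / 3.
before = (
    conn.execute(PER_ID_AVG).fetchone()[0],
    conn.execute(DISTINCT_AVG).fetchone()[0],
)

# Hypothetical extra row: id 4 reuses the value 10.
conn.execute("INSERT INTO tab VALUES (4, 10)")
after = (
    conn.execute(PER_ID_AVG).fetchone()[0],
    conn.execute(DISTINCT_AVG).fetchone()[0],
)

print(before)  # per-id average, distinct-value average
print(after)
```

After the extra row is added, the per-id average counts the value 10 twice (once for id 1, once for id 4), while avg(distinct value) still counts it once, so the two results diverge.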

Exclude columns from Group by

I have a table like this
My current query
Select team,
stat_id,
max(statsval) as statsval
from tbl
group by team,
stat_id
Issue:
I also need season in the select, and obviously I would then have to add it to the group by, but that gives me unexpected results. I can't change my group by: I need to group by team and stat_id only, so I can't group by season. I need the season of the max() record. Can someone help me with this?
I even tried
Select team,
stat_id,
max (seasonid),
max(statsval) as statsval
from tbl
group by team,
stat_id
But this returns the max season, which is not necessarily the season of the max(statsval) record.
Expected result
+--------+--------+-------+---------+---------+
| season | team | round | stat_id | statval |
+--------+--------+-------+---------+---------+
| 2004 | 500146 | 3 | 1 | 5 |
| 2007 | 500147 | 1 | 1 | 4 |
+--------+--------+-------+---------+---------+
Depending on your edition of SQL Server, this can be done with Window functions only:
SELECT DISTINCT team
, stat_id
, max(statsval) OVER (PARTITION BY team, stat_id) statsval
, FIRST_VALUE(season_id) OVER (PARTITION BY team, stat_id ORDER BY statsval DESC)
FROM tbl
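As a rough check of the FIRST_VALUE idea, the sketch below runs an equivalent query on SQLite via Python's sqlite3 module (window functions need SQLite 3.25+). The table layout and the two lower-scoring rows are assumptions made up for illustration; only the two expected result rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (season_id INTEGER, team INTEGER, stat_id INTEGER, statsval INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (2004, 500146, 1, 5),  # max record for team 500146
        (2005, 500146, 1, 3),  # hypothetical later season with a lower value
        (2006, 500147, 1, 2),  # hypothetical earlier season with a lower value
        (2007, 500147, 1, 4),  # max record for team 500147
    ],
)

# MAX(...) OVER gives the best value per (team, stat_id); FIRST_VALUE ordered
# by statsval DESC returns the season of that best record.
rows = conn.execute(
    """
    SELECT DISTINCT
           team,
           stat_id,
           MAX(statsval) OVER (PARTITION BY team, stat_id) AS max_statsval,
           FIRST_VALUE(season_id) OVER (
               PARTITION BY team, stat_id ORDER BY statsval DESC
           ) AS max_season
    FROM tbl
    ORDER BY team
    """
).fetchall()

for row in rows:
    print(row)
```

Team 500146 gets season 2004 rather than 2005, which is where a plain max(season) would go wrong.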
Try this using window functions
Select distinct team,
statid,
max(statsval) OVER(PARTITION BY team,statid) as statsval,
first_value(seasonid) OVER(PARTITION BY team,statid ORDER BY statsval desc) as seasonid
from tbl
Try this and look up the season after the grouping is done:
;with tmp as
(
select team,
stat_id,
max(statsval) as statsval
from tbl
group by team,
stat_id
)
select tmp.*,
tbl.seasonid
from tmp join tbl
on tmp.team = tbl.team and tmp.stat_id = tbl.stat_id and tmp.statsval = tbl.statsval;
If you want the complete row, you can simply use a correlated subquery:
Select t.*
from tbl t
where t.season = (select max(t2.season)
from tbl t2
where t2.team = t.team and t2.statsval = t.statsval
);
With an index on tbl(team, statsval, season), this probably has as good as or better performance than other options.
A fun method that has worse performance (even with the index) is:
select top (1) with ties t.*
from tbl t
order by row_number() over (partition by team, statsval order by season desc);

Need to sum all Most Recent Rows from each Store that have ItemID

I have a table with, among other things, these columns: DateTransferred, ComputedQuantity, StoreID, ItemID
I have two goals. My simpler goal is to write a query where I fill in the ItemID and it sums up the ComputedQuantity where it matches that ItemID, only using the most recent DateTransferred for each StoreID. So with the following example data:
DateTransferred | StoreID | ItemID | ComputedQuantity
11/10/17 | 1 | 1 | 3 <
10/10/17 | 1 | 1 | 4
09/10/17 | 2 | 1 | 9 <
08/10/17 | 3 | 1 | 1 <
07/10/17 | 3 | 1 | 10
I would want it to pull every row with < next to it, as that's the most recent Date for that StoreID, and sum up to 13
My more complicated goal is that I would like to include the above-calculated value into a 'join' where I'm dealing with the Item table, so that I can pull all the items and join them with a new column which has the summed up ComputedQuantity
This is on SQL Server 10 on Windows Server 2008, if that matters
One simple method uses a correlated subquery:
select t.*
from t
where t.DateTransferred = (select max(t2.DateTransferred)
from t t2
where t2.storeid = t.storeid
);
Another even simpler method uses window functions:
select t.*
from (select t.*,
row_number() over (partition by storeid order by DateTransferred desc) as seqnum
from t
) t
where seqnum = 1;
In either case, you can add a where clause to the subquery if you want the most recent date on or before some given date (say a year ago).
Also, these both assume that your data has no future dates. If it does, then add where DateTransferred < getdate().
The final statement which sums the ComputedQuantities:
select ItemID, SUM(ComputedQuantity) Quantity
from (select t.*,
row_number() over (partition by StoreID, ItemID order by DateTransferred DESC) as seqnum
from [db].[dbo].[InventoryTransferLog] t
) t
where seqnum = 1 and ComputedQuantity > 0
GROUP BY ItemID
ORDER BY ItemID
I decided not to sum values < 0
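To check that this kind of statement really yields 13 for the example data, here is a small sketch that replays it on SQLite through Python's sqlite3 module (SQLite 3.25+ assumed for ROW_NUMBER). The dates are rewritten in ISO format, which is my assumption so that they sort correctly as text; the table name is shortened to InventoryTransferLog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE InventoryTransferLog ("
    "DateTransferred TEXT, StoreID INTEGER, ItemID INTEGER, ComputedQuantity INTEGER)"
)
conn.executemany(
    "INSERT INTO InventoryTransferLog VALUES (?, ?, ?, ?)",
    [
        ("2017-10-11", 1, 1, 3),   # most recent for store 1
        ("2017-10-10", 1, 1, 4),
        ("2017-10-09", 2, 1, 9),   # most recent for store 2
        ("2017-10-08", 3, 1, 1),   # most recent for store 3
        ("2017-10-07", 3, 1, 10),
    ],
)

# Number the rows of each (store, item) from newest to oldest, keep the newest,
# then sum the surviving quantities per item.
totals = conn.execute(
    """
    SELECT ItemID, SUM(ComputedQuantity) AS Quantity
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY StoreID, ItemID ORDER BY DateTransferred DESC
               ) AS seqnum
        FROM InventoryTransferLog AS t
    ) AS latest
    WHERE seqnum = 1 AND ComputedQuantity > 0
    GROUP BY ItemID
    ORDER BY ItemID
    """
).fetchall()

print(totals)  # [(1, 13)]
```

Because the result has one row per ItemID, it could be joined to an Item table on ItemID to cover the second goal from the question.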

SQL - select columns and group them

I have a table like the one below, and I need the result to look like this when I run the query
Results
title | count
----------------
foo | 3
bar | 2
Table
customer_id | title
-------------------
55 | foo
22 | foo
55 | bar <-- duplicate
23 | bar
55 | bar <-- duplicate
23 | foo
UPDATE Thank you all for the quick response!
The trick is to count the distinct customer ids, so you won't count the duplicate bar for customer 55.
If you need to, you can order the results by that count too, or you can just leave out the order by clause.
select
title,
count(DISTINCT customer_id) as `count`
from
yourTable
group by
title
order by
`count` desc
It's as easy as this:
select A.title, count(*) as count -- a usual count(*)
from
(select distinct customer_id, title -- this query removes the duplicates
from table) A
group by A.title -- a usual group by
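For SQL Server you can do it like this. Note the distinct: a plain count(*) would count customer 55's duplicate bar row twice.
select
t.title ,
count(distinct t.customer_id)
from your_table as t
group by
t.title
order by count(distinct t.customer_id) DESC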
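A quick way to confirm the distinct count against the sample table is the sketch below, which uses Python's sqlite3 module. The table name your_table and the alias cnt are placeholders chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (customer_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        (55, "foo"),
        (22, "foo"),
        (55, "bar"),  # duplicate pair
        (23, "bar"),
        (55, "bar"),  # duplicate pair
        (23, "foo"),
    ],
)

# COUNT(DISTINCT customer_id) counts each customer once per title.
distinct_counts = conn.execute(
    """
    SELECT title, COUNT(DISTINCT customer_id) AS cnt
    FROM your_table
    GROUP BY title
    ORDER BY cnt DESC
    """
).fetchall()

# A plain COUNT(*) counts the repeated (55, bar) row twice.
plain_counts = dict(
    conn.execute("SELECT title, COUNT(*) FROM your_table GROUP BY title").fetchall()
)

print(distinct_counts)  # [('foo', 3), ('bar', 2)]
print(plain_counts)
```

The distinct version reproduces the requested result, while the plain count reports 3 for bar.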