How to count how many times a specific value appeared in each column and group by range - sql

I'm new to Postgres and I have a question:
I have a table with 100 columns. I need to take the values from each column and count how many times they appear, so I can group them based on the range they fit into.
I have a table like this (100 columns):
+------+------+------+------+------+---------+--------+
| Name | PRB0 | PRB1 | PRB2 | PRB3 | ....... | PRB100 |
+------+------+------+------+------+---------+--------+
| A    | 15   | 6    | 47   | 54   | .....   | 8      |
| B    | 25   | 22   | 84   | 86   | .....   | 76     |
| C    | 57   | 57   | 96   | 38   | .....   | 28     |
+------+------+------+------+------+---------+--------+
And I need the output to be something like this:
+------+---------------+----------------+----------------+----------------+-----+-----------------+
| Name | Count 0 to 20 | Count 21 to 40 | Count 41 to 60 | Count 61 to 80 | ... | Count 81 to 100 |
+------+---------------+----------------+----------------+----------------+-----+-----------------+
| A    | 5             | 46             | 87             | 34             | ... | 98              |
| B    | 5             | 2              | 34             | 56             | ... | 36              |
| C    | 7             | 17             | 56             | 78             | ... | 88              |
+------+---------------+----------------+----------------+----------------+-----+-----------------+
For Name A we have:
a number between 0 and 20 appeared 5 times
a number between 21 and 40 appeared 46 times
a number between 41 and 60 appeared 87 times
Basically I need something like the COUNTIFS function we have in Excel, where you just specify the range of columns and the condition.

You could unpivot with a lateral join, then aggregate:
select
    name,
    count(*) filter (where prb between 0 and 20)   as cnt_00_20,
    count(*) filter (where prb between 21 and 40)  as cnt_21_40,
    ...,
    count(*) filter (where prb between 81 and 100) as cnt_81_100
from mytable t
cross join lateral (values (t.prb0), (t.prb1), ..., (t.prb100)) p(prb)
group by name
Note, however, that this still requires you to enumerate all the columns in the values() table constructor. If you want something fully dynamic, you can use json instead. The idea is to turn each record into a json object using to_jsonb(), then into rows with jsonb_each(); you can then do conditional aggregation.
select
    name,
    count(*) filter (where prb::int between 0 and 20)   as cnt_00_20,
    count(*) filter (where prb::int between 21 and 40)  as cnt_21_40,
    ...,
    count(*) filter (where prb::int between 81 and 100) as cnt_81_100
from mytable t
cross join lateral to_jsonb(t) j(js)
cross join lateral jsonb_each(j.js - 'name') r(col, prb)
group by name
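If the buckets are regular, width_bucket() is a possible alternative to the hand-written BETWEEN filters. A minimal sketch, assuming the same mytable and a 0-100 value range; it returns one row per (name, bucket) rather than one wide row per name, and uses jsonb_each_text() so the text value casts cleanly to int:
select
    name,
    width_bucket(prb::int, 0, 101, 5) as bucket,  -- 1 = 0..20, 2 = 21..40, ..., 5 = 81..100
    count(*) as cnt
from mytable t
cross join lateral jsonb_each_text(to_jsonb(t) - 'name') r(col, prb)
group by name, bucket
order by name, bucket;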

Related

Dynamic intersections between groups based on relation

I have 2 tables:
product_facet_values_facet_value
+-----------+--------------+
| productId | facetValueId |
+-----------+--------------+
| 6         | 1            |
| 6         | 34           |
| 7         | 39           |
| 8         | 34           |
| 8         | 1            |
| 8         | 11           |
| 9         | 1            |
| 9         | 39           |
+-----------+--------------+
facet_value
+--------------+---------+
| facetValueId | facetId |
+--------------+---------+
| 1            | 2       |
| 34           | 6       |
| 39           | 2       |
| 44           | 2       |
| 56           | 11      |
+--------------+---------+
I need to be able to get all productIds for the facetValueIds I ask for, but with one extra step - I need an intersection between facetValueId groups that share the same facetId.
For example, I want to get all product ids with facetValueIds 1, 34, 39, and the result of this query should be the same as I would get with the following query:
select "productId"
from "product_facet_values_facet_value"
where "facetValueId" in (1, 39)
INTERSECT
select "productId"
from "product_facet_values_facet_value"
where "facetValueId" in (34)
I wrote this query based on the fact that facetValueIds 1 and 39 share "facetId" = 2, while facetValueId 34 has "facetId" = 6.
I need a query that produces the same result without having to group the values manually. If, for example, I next ask for all products that have facetValueIds 1, 34, 39, 56, the result of such a dynamic query should be the same as writing 3 INTERSECTs between IN (1, 39) & IN (34) & IN (56), like:
select "productId"
from "product_facet_values_facet_value"
where "facetValueId" in (1, 39)
INTERSECT
select "productId"
from "product_facet_values_facet_value"
where "facetValueId" in (34)
INTERSECT
select "productId"
from "product_facet_values_facet_value"
where "facetValueId" in (56)
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=d06344b4a68c7b97fc1fad46c7437894
This is the same method as #a_horse_with_no_name used, but generalised very slightly.
WITH targets AS
(
    SELECT * FROM facet_value WHERE facetId IN (2, 6)
)
SELECT
    map.productId
FROM
    product_facet_values_facet_value AS map
INNER JOIN
    targets AS tgt
        ON tgt.facetValueId = map.facetValueId
GROUP BY
    map.productId
HAVING
    COUNT(DISTINCT tgt.facetId) = (SELECT COUNT(DISTINCT facetId) FROM targets)
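Since the question starts from facetValueIds rather than facetIds, a slight variation (a sketch under that assumption, using the requested values 1, 34, 39) is to build targets from the facetValueIds directly; the HAVING clause is relational division, demanding a match in every distinct facetId represented among the targets:
WITH targets AS
(
    SELECT * FROM facet_value WHERE facetValueId IN (1, 34, 39)
)
SELECT
    map.productId
FROM
    product_facet_values_facet_value AS map
INNER JOIN
    targets AS tgt
        ON tgt.facetValueId = map.facetValueId
GROUP BY
    map.productId
HAVING
    COUNT(DISTINCT tgt.facetId) = (SELECT COUNT(DISTINCT facetId) FROM targets)
Against the sample data this returns productIds 6 and 8, matching the manual INTERSECT of IN (1, 39) and IN (34).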

Distribute sequential SQL results evenly based on count

I have SQL results that I need to break into item ranges, with the count distributed evenly across a number of tasks. What is a good way to do this?
My data looks like this.
+------+-------+----------+
| Item | Count | ItmGroup |
+------+-------+----------+
| 1A   | 100   | 1        |
| 1B   | 25    | 1        |
| 1C   | 2     | 1        |
| 1D   | 6     | 1        |
| 2A   | 88    | 2        |
| 2B   | 10    | 2        |
| 2C   | 122   | 2        |
| 2D   | 12    | 2        |
| 3A   | 4     | 3        |
| 3B   | 103   | 3        |
| 3C   | 1     | 3        |
| 3D   | 22    | 3        |
| 4A   | 55    | 4        |
| 4B   | 42    | 4        |
| 4C   | 100   | 4        |
| 4D   | 1     | 4        |
+------+-------+----------+
Item = the item code.
Count = in this context it indicates the popularity of the item. This can be used to RANK items if need be.
ItmGroup = a parent value for the Item column. Each Item is contained in a Group.
What differentiates this from other similar questions I've viewed is that the ranges I need to determine cannot be taken out of the order they show in this table. An item range can cross over ItmGroups (say, from 1A to 3B), but the items must remain in alphanumeric order.
The expected result would be item ranges that evenly distribute the total count.
+--------+--------+----------+
| FrItem | ToItem | TotCount |
+--------+--------+----------+
| 1A     | 2D     | 134      |
| 3A     | 3D     | 130      |
(etc)
Provided you're happy with a rough estimate, this will split the data into two groups.
The first group will always have as many records as possible, but no more than half of the total count (and group 2 will have the rest).
WITH cumulative AS
(
    SELECT
        *,
        SUM([Count]) OVER (ORDER BY Item) AS cumulativeCount,
        SUM([Count]) OVER ()              AS totalCount
    FROM
        yourData
)
SELECT
    MIN(Item)    AS frItem,
    MAX(Item)    AS toItem,
    SUM([Count]) AS TotCount
FROM
    cumulative
GROUP BY
    CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
ORDER BY
    CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
To split the data into 5 portions, it's similar...
GROUP BY
    CASE WHEN cumulativeCount <= totalCount * 1/5 THEN 0
         WHEN cumulativeCount <= totalCount * 2/5 THEN 1
         WHEN cumulativeCount <= totalCount * 3/5 THEN 2
         WHEN cumulativeCount <= totalCount * 4/5 THEN 3
         ELSE 4 END
Depending on your data this isn't necessarily ideal:
Item | Count | GroupAsDefinedAbove | IdealGroup
-----+-------+---------------------+-----------
1A   | 4     | 1                   | 1
2A   | 5     | 2                   | 1
3A   | 8     | 2                   | 2
If you want something that can get the two groups as close in size as possible, that's a lot more complex.
Same as the accepted answer, except it declares a batch count and adds a BatchNo column to the SELECT in the cumulativeCte to prevent a remainder.
DECLARE @BatchCount NUMERIC(4,2) = 5.00;
WITH cumulativeCte AS
(
    SELECT
        *,
        SUM(r.[Count]) OVER (ORDER BY r.Item) AS cumulativeCount,
        SUM(r.[Count]) OVER ()                AS totalCount,
        CEILING(SUM(r.[Count]) OVER (ORDER BY r.Item ASC)
                / (SUM(r.[Count]) OVER () / @BatchCount)) AS BatchNo
    FROM
        records r
)
SELECT
    MIN(c.Item)    AS frItem,
    MAX(c.Item)    AS toItem,
    SUM(c.[Count]) AS TotCount,
    c.BatchNo
FROM
    cumulativeCte c
GROUP BY
    c.BatchNo
ORDER BY
    c.BatchNo
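To see the arithmetic against the sample data: totalCount is 693, so each batch targets 693 / 5.00 = 138.6, and BatchNo = CEILING(cumulativeCount / 138.6) falls in 1..5 (item 1A, with cumulativeCount 100, gets CEILING(100 / 138.6) = 1). Declaring @BatchCount as NUMERIC(4,2) rather than an integer keeps that division from truncating.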

Query to get the count of data for particular customer with all other data from table

My table structure is as follows:
group_id | cust_id | ticket_num
---------+---------+-----------
60       | 12      | 1
60       | 12      | 2
60       | 12      | 3
60       | 12      | 4
60       | 30      | 5
60       | 30      | 6
60       | 31      | 7
60       | 31      | 8
65       | 02      | 1
I want to fetch all the data for group_id=60 and find the count of ticket_num for each customer in that group. My output should be like this:
cust_id | ticket_count | ticket_num
--------+--------------+-----------
12      | 4            | 1
12      |              | 2
12      |              | 3
12      |              | 4
30      | 2            | 5
30      |              | 6
31      | 2            | 7
31      |              | 8
I tried this query:
SELECT gd.cust_id, Count(gd.cust_id),gd.ticket_num
FROM Group_details gd
WHERE gd.group_id = 65
GROUP BY gd.cust_id;
But this query is not working.
You appear to want the ANSI/ISO standard row_number() and count() window functions:
select gd.cust_id,
       count(*) over (partition by gd.cust_id) as num_tickets,
       row_number() over (order by gd.cust_id) as ticket_seqnum
from group_details gd
where gd.group_id = 60;
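If you want ticket_count shown only on each customer's first row, exactly as in the sample output, a sketch along the same lines (assuming ticket_num drives the ordering) blanks the repeats with a CASE over row_number():
select gd.cust_id,
       case when row_number() over (partition by gd.cust_id order by gd.ticket_num) = 1
            then count(*) over (partition by gd.cust_id)
       end as ticket_count,
       gd.ticket_num
from group_details gd
where gd.group_id = 60
order by gd.cust_id, gd.ticket_num;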
Use an aggregate and a subquery:
select t2.*, t1.ticket_num
from Group_details t1
inner join
(
    SELECT gd.cust_id, Count(gd.ticket_num) as ticket_count
    FROM Group_details gd
    WHERE gd.group_id = 60
    GROUP BY gd.cust_id
) t2 on t1.cust_id = t2.cust_id
http://sqlfiddle.com/#!9/dd718b/1

How to use the previous row's column value to calculate the next row's column value

I have a table that looks like this:
Id | Aisle | OddEven | Bay | Size | Y-Axis
---+-------+---------+-----+------+-------
3  | A1    | Even    | 14  | 10   | 100
1  | A1    | Even    | 16  | 10   |
6  | A1    | Even    | 20  | 10   |
12 | A1    | Even    | 26  | 5    | 150
10 | A1    | Even    | 28  | 5    |
11 | A1    | Even    | 32  | 5    |
2  | A1    | Odd     | 13  | 10   | 100
5  | A1    | Odd     | 17  | 10   |
4  | A1    | Odd     | 19  | 10   |
9  | A1    | Odd     | 23  | 5    | 150
7  | A1    | Odd     | 25  | 5    |
8  | A1    | Odd     | 29  | 5    |
and I want it to look like this:
Id | Aisle | OddEven | Bay | Size | Y-Axis
---+-------+---------+-----+------+-------
1  | A1    | Even    | 14  | 10   | 100
2  | A1    | Even    | 16  | 10   | 110
3  | A1    | Even    | 20  | 10   | 120
4  | A1    | Even    | 26  | 5    | 150
5  | A1    | Even    | 28  | 5    | 155
6  | A1    | Even    | 32  | 5    | 160
7  | A1    | Odd     | 13  | 10   | 100
8  | A1    | Odd     | 17  | 10   | 110
9  | A1    | Odd     | 19  | 10   | 120
10 | A1    | Odd     | 23  | 5    | 150
11 | A1    | Odd     | 25  | 5    | 155
12 | A1    | Odd     | 29  | 5    | 160
I need a select query and an update query. Some Y-Axis values are already filled in (at the start of each Odd/Even run); for every other row I need to take the previous row's Y-Axis value and add the current row's Size to get the current row's Y-Axis. This continues until another row with a given Y-Axis value is found; that row skips the calculation, and the next row builds on its number.
My thinking process is this:
Id will definitely be used; however, the Ids are not in sequence, as shown in my example,
so I need to have
ROW_NUMBER() OVER (PARTITION BY Aisle, OddEven, Bay ORDER BY Aisle, OddEven, Bay)
Then some kind of self JOIN on the same table, where the ON is T1.RN = T2.RN - 1.
Where I am stuck: the first row has no previous value, so it will still try to update that value.
Any ideas for SQL Server 2008 select and update queries would be greatly appreciated! Thanks.
You seem to want a cumulative sum. This would be easier in SQL Server 2012+. You can do this in SQL Server 2008 using outer apply:
select t.*, cume_value
from t outer apply
(select sum(size) + sum(yaxis) as cume_value
from t t2
where t2.aisle = t.aisle and t2.oddeven = t.oddeven and
t2.bay < t.bay
) t2;
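One caveat against the sample data: because this sums every earlier Y-Axis in the partition, it drifts once a second anchor row appears (for the Even row at bay 28 it yields 285 rather than 155). A sketch that instead anchors on the most recent given Y-Axis, still using 2008-compatible outer apply and assuming bay is unique within (aisle, oddeven):
select t.*, a.anchor_y + s.size_since as cume_value
from t
outer apply
     (-- most recent row at or before this one with a given Y-Axis
      select top 1 t2.yaxis as anchor_y, t2.bay as anchor_bay
      from t t2
      where t2.aisle = t.aisle and t2.oddeven = t.oddeven and
            t2.bay <= t.bay and t2.yaxis is not null
      order by t2.bay desc
     ) a
outer apply
     (-- sizes accumulated since that anchor, up to and including this row
      select isnull(sum(t3.size), 0) as size_since
      from t t3
      where t3.aisle = t.aisle and t3.oddeven = t.oddeven and
            t3.bay > a.anchor_bay and t3.bay <= t.bay
     ) s;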
A little more difficult on 2008, but I think this is what you are looking for
Declare @Table table (Id int, Aisle varchar(25), OddEven varchar(25), Bay int, Size int, [Y-Axis] int)
Insert Into @Table values
 (3, 'A1', 'Even', 14, 10, 100),
 (1, 'A1', 'Even', 16, 10, 0),
 (6, 'A1', 'Even', 20, 10, 0),
 (12, 'A1', 'Even', 26, 5, 150),
 (10, 'A1', 'Even', 28, 5, 0),
 (11, 'A1', 'Even', 32, 5, 0),
 (2, 'A1', 'Odd', 13, 10, 100),
 (5, 'A1', 'Odd', 17, 10, 0),
 (4, 'A1', 'Odd', 19, 10, 0),
 (9, 'A1', 'Odd', 23, 5, 150),
 (7, 'A1', 'Odd', 25, 5, 0),
 (8, 'A1', 'Odd', 29, 5, 0)
;with cteBase as (
    Select *,
        IDNew = Row_Number() over (Order By Aisle, Bay),
        RowNr = Row_Number() over (Order By Aisle, OddEven, Bay)
    From @Table
),
cteGroup as (
    Select TmpRowNr = RowNr,
        GrpNr = Row_Number() over (Order By RowNr)
    from cteBase
    where [Y-Axis] > 0
),
cteFinal as (
    Select A.*,
        GrpNr = (Select max(GrpNr) from cteGroup Where TmpRowNr <= RowNr)
    From cteBase A
)
Select ID = Row_Number() over (Order By A.OddEven, A.Bay),
    A.Aisle,
    A.OddEven,
    A.Bay,
    A.Size,
    [Y-Axis] = Sum(case when B.[Y-Axis] > 0 then B.[Y-Axis] else B.Size end)
From cteFinal A
Join cteFinal B on (B.RowNr <= A.RowNr and A.GrpNr = B.GrpNr)
Group By A.IDNew, A.Aisle, A.OddEven, A.Bay, A.Size
Order By A.OddEven, A.Bay
Returns
ID  Aisle  OddEven  Bay  Size  Y-Axis
1   A1     Even     14   10    100
2   A1     Even     16   10    110
3   A1     Even     20   10    120
4   A1     Even     26   5     150
5   A1     Even     28   5     155
6   A1     Even     32   5     160
7   A1     Odd      13   10    100
8   A1     Odd      17   10    110
9   A1     Odd      19   10    120
10  A1     Odd      23   5     150
11  A1     Odd      25   5     155
12  A1     Odd      29   5     160
I've got to leave my computer, but the update query should be easy to move on from here.
Below is the select query:
select row_number() over (order by OddEven, Bay) as id,
       Aisle,
       OddEven,
       Bay,
       Size,
       max(ISNULL([Y-Axis], 0)) over (partition by Aisle, OddEven, Size order by Bay)
       + sum(CASE WHEN [Y-Axis] is null THEN Size ELSE 0 END) over (partition by Aisle, OddEven, Size order by Bay) as [Y-Axis]
from oddseven
order by id
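The update counterpart the question asks for can reuse the same expression through a CTE and write it back; a sketch assuming Id is unique in oddseven (note that, like the select above, the ordered window aggregates need SQL Server 2012+; on 2008 the join-based answer above would be the route):
;WITH calc AS
(
    select Id,
           max(ISNULL([Y-Axis], 0)) over (partition by Aisle, OddEven, Size order by Bay)
           + sum(CASE WHEN [Y-Axis] is null THEN Size ELSE 0 END)
                 over (partition by Aisle, OddEven, Size order by Bay) as NewY
    from oddseven
)
UPDATE o
SET o.[Y-Axis] = c.NewY
FROM oddseven o
JOIN calc c ON c.Id = o.Id
WHERE o.[Y-Axis] IS NULL;  -- only fill the blanks; leave the given anchor values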

Return dynamic columns by joining results of a stored function

I have a stored function called Fnc_MyFunc(@myDate). It takes a date parameter and returns a table like this:
user_id | count_laps
--------+-----------
1       | 85
2       | 37
5       | 55
12      | 48
I want to execute this for many dates (a date interval).
With my function I want a result like this (laps per day for all users):
user_id | [2015-10-01] | [2015-10-02] | [2015-10-03] | ....
--------+--------------+--------------+--------------+------
1       | 85           | 2            | 66           | ....
2       | 37           | 58           | 85           | ....
5       | 55           | 33           | 75           | ....
12      | 48           | 44           | 55           | ....
This query should do what you want:
SELECT
    u.user_id,
    d1.count_laps as [2015-10-01],
    d2.count_laps as [2015-10-02],
    d3.count_laps as [2015-10-03]
FROM
    TableOrQueryWithDistinctUserIds u
    LEFT JOIN Fnc_MyFunc('2015-10-01') d1 ON u.user_id = d1.user_id
    LEFT JOIN Fnc_MyFunc('2015-10-02') d2 ON u.user_id = d2.user_id
    LEFT JOIN Fnc_MyFunc('2015-10-03') d3 ON u.user_id = d3.user_id
You can then make it dynamic if necessary.
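As a sketch of that dynamic step (assumptions: SQL Server, the function lives in the dbo schema, and TableOrQueryWithDistinctUserIds stands in for your real source of distinct user ids), build the column and join lists in a loop and run the result with sp_executesql:
DECLARE @startDate date = '2015-10-01', @endDate date = '2015-10-03';
DECLARE @d date = @startDate, @i int = 0;
DECLARE @cols nvarchar(max) = N'', @joins nvarchar(max) = N'';

WHILE @d <= @endDate
BEGIN
    DECLARE @alias nvarchar(12) = N'd' + CAST(@i AS nvarchar(10));
    DECLARE @day nvarchar(10) = CONVERT(nvarchar(10), @d, 23);  -- yyyy-mm-dd
    SET @cols  += N', ' + @alias + N'.count_laps AS [' + @day + N']';
    SET @joins += N' LEFT JOIN dbo.Fnc_MyFunc(''' + @day + N''') ' + @alias
                + N' ON u.user_id = ' + @alias + N'.user_id';
    SET @d = DATEADD(day, 1, @d);
    SET @i += 1;
END;

DECLARE @sql nvarchar(max) =
      N'SELECT u.user_id' + @cols
    + N' FROM TableOrQueryWithDistinctUserIds u'  -- placeholder: your real user source
    + @joins + N';';

EXEC sp_executesql @sql;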