Adding a subquery to a grid query - SQL

Following on from this question, I have another query for which I need to subtract 10 from all negative numbers in the data. Sadly, I'm just not sure how to implement the same subquery as the one given in the previous question.
The query in question is
SELECT 10 * (c.customer_x / 10), 10 * (c.customer_y / 10),
COUNT(*) as num_orders,
SUM(o.order_total)
FROM t_customer c
JOIN t_order o
ON c.customer_id = o.customer_id
GROUP BY c.customer_x / 10, c.customer_y / 10
ORDER BY SUM(o.order_total) DESC;
which calculates the order totals from each grid square.
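The background, from the earlier question: in engines where integer division truncates toward zero (SQL Server and PostgreSQL, for example), -5 / 10 = 0, so a customer at x = -5 would land in the same grid square as one at x = 5. Subtracting 10 from negative coordinates first fixes this: -5 becomes -15, and 10 * (-15 / 10) = -10, a proper negative square.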

Your original query doesn't change much in the query below. The only real difference is that t_customer is replaced by a subquery that adjusts the negative coordinates before the join; the SELECT list merely gains column aliases:
SELECT 10 * (c.customer_x / 10) AS col1,
10 * (c.customer_y / 10) AS col2,
COUNT(*) AS num_orders,
SUM(o.order_total) AS order_total_sum
FROM
(
SELECT customer_id,
CASE WHEN customer_x < 0 THEN customer_x - 10 ELSE customer_x END AS customer_x,
CASE WHEN customer_y < 0 THEN customer_y - 10 ELSE customer_y END AS customer_y
FROM t_customer
) c
INNER JOIN t_order o
ON c.customer_id = o.customer_id
GROUP BY c.customer_x / 10,
c.customer_y / 10
ORDER BY SUM(o.order_total) DESC;
Note that you could solve this without the subquery, as the sketch below shows. However, the subquery makes the statement much more readable and lets us compute the adjusted customer_x and customer_y values in one place.
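For comparison, here is a subquery-free sketch of the same idea; the CASE expressions have to be repeated in both the SELECT list and the GROUP BY, which is exactly the duplication the subquery avoids:
SELECT 10 * (CASE WHEN c.customer_x < 0 THEN c.customer_x - 10 ELSE c.customer_x END / 10) AS col1,
10 * (CASE WHEN c.customer_y < 0 THEN c.customer_y - 10 ELSE c.customer_y END / 10) AS col2,
COUNT(*) AS num_orders,
SUM(o.order_total) AS order_total_sum
FROM t_customer c
INNER JOIN t_order o
ON c.customer_id = o.customer_id
GROUP BY CASE WHEN c.customer_x < 0 THEN c.customer_x - 10 ELSE c.customer_x END / 10,
CASE WHEN c.customer_y < 0 THEN c.customer_y - 10 ELSE c.customer_y END / 10
ORDER BY SUM(o.order_total) DESC;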


How to add up values from multiple SQL columns based on occurrences

I need to select values from a table and return the total hours for all categories and their occurrences. The challenge is that there are different totals for each occurrence.
My query:
SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences
FROM dbo.Categories sc
INNER JOIN dbo.Categories c
ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh
ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100'
GROUP BY c.Category, c.HrsFirstOccur, c.HrsAddlOccur
returns the following results:
Category   HrsFirstOccur   HrsAddlOccur   Occurrences
Inertia    24              16              2
Lights     1               0.5             4
Labor      10              0               1
The total is calculated based on the number of occurrences: the first occurrence counts HrsFirstOccur, then each additional occurrence adds HrsAddlOccur.
My final result should be (24 + 16) + (1 + 0.5 + 0.5 + 0.5) + 10, for a grand total of 52.5.
How do I loop and process the results to total this up?
SQL databases understand arithmetic. You can perform the computation on each row. As I understand it, the logic you want is:
SELECT
c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences,
c.HrsFirstOccur + (COUNT(*) - 1) * c.HrsAddlOccur AS Total
FROM ... < rest of your query > ..
Later on you can aggregate the whole resultset to get the grand total:
SELECT SUM(Total) GrandTotal
FROM (
... < above query > ..
) t
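Assembled with the FROM clause from the question, the inner query would then look something like this:
SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences,
c.HrsFirstOccur + (COUNT(*) - 1) * c.HrsAddlOccur AS Total
FROM dbo.Categories sc
INNER JOIN dbo.Categories c
ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh
ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100'
GROUP BY c.Category, c.HrsFirstOccur, c.HrsAddlOccur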
You can simply sum them up:
WITH CTE as(SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences
FROM dbo.Categories sc
INNER JOIN dbo.Categories c ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100'
GROUP BY c.Category, c.HrsFirstOccur, c.HrsAddlOccur)
SELECT SUM(HrsFirstOccur + (CAST((Occurrences -1) AS DECIMAL(8,2)) * HrsAddlOccur)) as total FROM CTE
It works just like this example:
CREATE TABLE CTE
([Category] varchar(7), [HrsFirstOccur] int, [HrsAddlOccur] DECIMAL(8,2), [Occurrences] int)
;
INSERT INTO CTE
([Category], [HrsFirstOccur], [HrsAddlOccur], [Occurrences])
VALUES
('Inertia', 24, 16, 2),
('Lights', 1, 0.5, 4),
('Labor', 10, 0, 1)
;
3 rows affected
SELECT SUM(HrsFirstOccur + (CAST((Occurrences -1) AS DECIMAL(8,2)) * HrsAddlOccur)) as total
FROM CTE
total
52.5000
fiddle
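As a cross-check against the question's numbers: Inertia contributes 24 + (2 - 1) * 16 = 40, Lights 1 + (4 - 1) * 0.5 = 2.5, and Labor 10 + (1 - 1) * 0 = 10, and 40 + 2.5 + 10 = 52.5, matching the expected grand total.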

SQL - Group values by range

I have the following query:
SELECT
polutionmm2 AS metric,
sum(cnt) as value
FROM polutiondistributionstatistic as p inner join crates as c on p.crateid = c.id
WHERE
c.name = '154'
and to_timestamp(startts) >= '2021/01/20 00:00:00'
group by polutionmm2
this query returns these values:
"metric","value"
50,580
100,8262
150,1548
200,6358
250,869
300,3780
350,505
400,2248
450,318
500,1674
550,312
600,7420
650,1304
700,2445
750,486
800,985
850,139
900,661
950,99
1000,550
I would need to edit the query so that it groups them together in ranges of 100, starting from 0. Everything with a metric value between 0 and 99 should be one row, and its value should be the sum of those rows' values, like this:
"metric","value"
0,580
100,9810
200,7227
300,4285
400,2556
500,1986
600,8724
700,2931
800,1124
900,760
1000,550
The query will run over about 500,000 rows. Can this be done in the query, and is it efficient?
EDIT:
There can be up to 500 ranges, so an automatic way of grouping them would be great.
You can use generate_series() and a range type to generate the ranges you want, e.g.:
select int4range(x.start, case when x.start = 1000 then null else x.start + 100 end, '[)') as range
from generate_series(0,1000,100) as x(start)
This generates the ranges [0,100), [100,200), and so on, up until [1000,).
You can adjust the width and the number of ranges by using different parameters for generate_series() and by adjusting the expression that evaluates the last range.
This can be used in an outer join to aggregate the values per range:
with ranges as (
select int4range(x.start, case when x.start = 1000 then null else x.start + 100 end, '[)') as range
from generate_series(0,1000,100) as x(start)
)
select r.range as metric,
sum(t.value)
from ranges r
left join the_table t on r.range @> t.metric
group by range;
The expression r.range @> t.metric tests whether the metric value falls into the (generated) range; thanks to the left join, ranges with no matching rows still show up, with a NULL sum.
Online example
You can create a pseudo-table with whatever interval you like and join with that table.
I'll use a recursive CTE for this case.
WITH RECURSIVE cte AS(
select 0 St, 99 Ed
UNION ALL
select St + 100, Ed + 100 from cte where St <= 1000
)
select cte.st as metric,sum(tb.value) as value from cte
inner join [tableName] tb --with OP query result
on tb.metric between cte.St and cte.Ed
group by cte.st
order by st
Here is a DB<>fiddle with some pseudo data.
Use conditional aggregation:
SELECT
case when polutionmm2>=0 and polutionmm2<100 then '0'
when polutionmm2>=100 and polutionmm2<200 then '100'
........
when polutionmm2>=900 and polutionmm2<1000 then '900'
when polutionmm2>=1000 then '1000'
end AS metric,
sum(cnt) as value
FROM polutiondistributionstatistic as p inner join crates as c on p.crateid = c.id
WHERE
c.name = '154'
and to_timestamp(startts) >= '2021/01/20 00:00:00'
group by case when polutionmm2>=0 and polutionmm2<100 then '0'
when polutionmm2>=100 and polutionmm2<200 then '100'
........
when polutionmm2>=900 and polutionmm2<1000 then '900'
when polutionmm2>=1000 then '1000'
end
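Since the buckets all have a uniform width of 100, plain integer arithmetic can also produce the bucket label directly, with no CASE branches to write or maintain; this scales to the ~500 ranges mentioned in the edit. A minimal sketch against the same tables, assuming polutionmm2 is a non-negative integer (integer division truncates toward zero):
SELECT (polutionmm2 / 100) * 100 AS metric,   -- 0-99 -> 0, 100-199 -> 100, ...
       sum(cnt) AS value
FROM polutiondistributionstatistic AS p
INNER JOIN crates AS c ON p.crateid = c.id
WHERE c.name = '154'
  AND to_timestamp(startts) >= '2021/01/20 00:00:00'
GROUP BY (polutionmm2 / 100) * 100
ORDER BY metric;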

Select random sample of N rows from Oracle SQL query result

I want to reduce the number of rows exported from a query result. I have had no luck adapting the accepted solution posted on this thread.
My query looks as follows:
select
round((to_date('2019-12-31') - date_birth) / 365, 0) as age
from
personal_info a
where
exists
(
select b.person_id from credit_info b where b.credit_type = 'C' and a.person_id = b.person_id
)
;
This query returns way more rows than I need, so I was wondering if there's a way to use sample() to select a fixed number of rows (not a percentage) from however many rows result from this query.
You can sample your data by ordering it randomly with DBMS_RANDOM.RANDOM and then fetching the first N rows.
select round((to_date('2019-12-31') - date_birth) / 365, 0) as age
From personal_info a
where exists ( select b.person_id from credit_info b where b.credit_type = 'C' and a.person_id = b.person_id )
Order by DBMS_RANDOM.RANDOM
Fetch first 250 rows only
Edit: for Oracle 11g and earlier:
Select * from (
select round((to_date('2019-12-31') - date_birth) / 365, 0) as age
From personal_info a
where exists ( select b.person_id from credit_info b where b.credit_type = 'C' and a.person_id = b.person_id )
Order by DBMS_RANDOM.RANDOM
)
Where rownum <= 250
You can use fetch first to return a fixed number of rows. Just add:
fetch first 100 rows only
to the end of your query.
If you want these sampled in some fashion, you need to explain what type of sampling you want.
If you are using 12C, you can use the row limiting clause below
select
round((to_date('2019-12-31') - date_birth) / 365, 0) as age
from
personal_info a
where
exists
(
select b.person_id from credit_info b where b.credit_type = 'C' and a.person_id = b.person_id
)
FETCH NEXT 5 ROWS ONLY;
Instead of 5, you can use any number you want.
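As an aside on the sample() approach from the question: Oracle's SAMPLE clause takes a percentage of rows (or blocks), not a fixed row count, so on its own it cannot guarantee exactly N rows. Ordering by DBMS_RANDOM.RANDOM (or DBMS_RANDOM.VALUE) and limiting the result, as shown in the answers above, is the usual way to get a fixed-size random sample.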

Out of range integer: infinity

So I'm trying to work through a problem that's a bit hard to explain, and I can't expose any of the data I'm working with, but what I'm trying to get my head around is the error below when running the query that follows. I've renamed some of the tables and columns for sensitivity reasons, but the structure should be the same.
"Error from Query Engine - Out of range for integer: Infinity"
WITH accounts AS (
SELECT t.user_id
FROM table_a t
WHERE t.type like '%Something%'
),
CTE AS (
SELECT
st.x_user_id,
ad.name as client_name,
sum(case when st.score_type = 'Agility' then st.score_value else 0 end) as score,
st.obs_date,
ROW_NUMBER() OVER (PARTITION BY st.x_user_id,ad.name ORDER BY st.obs_date) AS rn
FROM client_scores st
LEFT JOIN account_details ad on ad.client_id = st.x_user_id
INNER JOIN accounts on st.x_user_id = accounts.user_id
--WHERE st.x_user_id IN (101011115,101012219)
WHERE st.obs_date >= '2020-05-18'
group by 1,2,4
)
SELECT
c1.x_user_id,
c1.client_name,
c1.score,
c1.obs_date,
CAST(COALESCE (((c1.score - c2.score) * 1.0 / c2.score) * 100, 0) AS INT) AS score_diff
FROM CTE c1
LEFT JOIN CTE c2 on c1.x_user_id = c2.x_user_id and c1.client_name = c2.client_name and c1.rn = c2.rn +2
I know the query works for sure, because when I get rid of the first CTE and hard-code 2 IDs into the where clause I commented out, it returns the data I want. But I also need it to run based on the first CTE, which has ~5k unique IDs.
Here is a sample output if I try with 2 IDs:
Based on the number of rows returned per ID above, I would expect it to return 5000 * 3 rows = 15,000.
What could be causing the out of range for integer error?
This line is likely your problem:
CAST(COALESCE (((c1.score - c2.score) * 1.0 / c2.score) * 100, 0) AS INT) AS score_diff
When the value of c2.score is 0, 1.0 / c2.score will be infinity, which will not fit into the integer type you're trying to cast it to.
The reason it’s working for the two users in your example is that they don’t have a 0 value for c2.score.
You might be able to fix this by changing to:
CAST(COALESCE (((c1.score - c2.score) * 1.0 / NULLIF(c2.score, 0)) * 100, 0) AS INT) AS score_diff
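With this change, whenever c2.score is 0, NULLIF(c2.score, 0) returns NULL, the division then yields NULL instead of infinity, and the outer COALESCE maps the whole expression to 0.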

Cumulative Percentage Categorization Query

I've got a query that's been driving me up the wall. The T-SQL query is as follows, but with about a thousand records in the source table it's taking forever and a day to run. Can anyone think of a faster way of accomplishing the same task?
SELECT *, ROUND((SELECT SUM(PartTotal)
FROM PartSalesRankings
WHERE Item_Rank <= sub.Item_Rank) /
(SELECT SUM(PartTotal)
FROM PartSalesRankings) * 100, 2) as Cum_PC_Of_Total
FROM PartSalesRankings As sub
I'm trying to classify my inventory into A, B, and C categories based on percentage of cost, but it needs to be a cumulative percentage of cost, i.e. 'A' parts make up 80% of my cost, 'B' parts make up the next 15%, and 'C' parts are the last 5%. There's obviously more to the SQL statement than what I've included, but the code I posted is the bottleneck.
Any help would be greatly appreciated!
Aj
Try this:
;WITH SumPartTotal AS
(SELECT SUM(PartTotal) as SumPartTotal, Item_Rank
FROM PartSalesRankings
GROUP BY Item_Rank
),
CumSumPartTotal AS
(SELECT SUM(Sub.SumPartTotal) as CumSumPartTotal, SPT.Item_Rank
FROM SumPartTotal as SPT
INNER JOIN SumPartTotal as Sub ON
SPT.Item_Rank >= Sub.Item_Rank
GROUP BY SPT.Item_Rank
),
SumTotal AS
(SELECT SUM(PartTotal) as SumTotal
FROM PartSalesRankings
)
SELECT *,
ROUND((CumSumPartTotal.CumSumPartTotal * 100.0) / SumTotal.SumTotal, 2) as Cum_PC_Of_Total
FROM PartSalesRankings As sub
INNER JOIN CumSumPartTotal ON
sub.Item_Rank = CumSumPartTotal.Item_Rank
CROSS JOIN SumTotal
It should give your query a speed boost.
Sorry for the delay in response; I've been up to my neck in other stuff, but after a great deal of trial and error I've found the following to be the fastest method:
Select companypartnumber
, (PartTotal + IsNull(Cum_Lower_Ranks, 0) ) / Sum(PartTotal) over() * 100 as Cum_PC_Of_Total
from PartSalesRankings PSRMain
Left join
(
Select PSRTop.Item_Rank, Sum(PSRBelow.PartTotal) as Cum_Lower_Ranks
from partSalesRankings PSRTop
Left join PartSalesRankings PSRBelow on PSRBelow.Item_Rank < PSRTop.Item_Rank
Group by PSRTop.Item_Rank
) as PSRLowerCums on PSRLowerCums.Item_Rank = PSRMain.Item_Rank
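For what it's worth, on SQL Server 2012 and later the cumulative percentage can be computed with a single windowed SUM, avoiding the self-join entirely. A minimal sketch, assuming Item_Rank uniquely orders the rows in PartSalesRankings:
Select companypartnumber,
Round(Sum(PartTotal) over (Order by Item_Rank Rows Unbounded Preceding) -- running total up to this rank
* 100.0 / Sum(PartTotal) over (), 2) as Cum_PC_Of_Total -- as a percentage of the grand total
From PartSalesRankings;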