SQL Server 2016 incorrect query plan estimate despite updated statistic - sql

I'm in the middle of optimizing a query and noticed that it is really slow because the optimizer estimates 16.6 rows, while the actual number of rows returned is 565824. I updated the statistics, then dropped and recreated them, but the estimate is still wrong. This is on SQL Server 2016; any help is appreciated.
SQL:
select cd_key
from dbo.CAMPDIV
where cd_camp = 'a'
and CD_CAMPYR = '2018'
option (recompile)
Histogram for nonclustered index (cd_campyr)
All Density Average Length Columns
------------------------------------------------------------
0.02040816 4 CD_CAMPYR
7.412665E-08 8 CD_CAMPYR, CD_ID
7.184833E-08 18 CD_CAMPYR, CD_ID, CD_CAMP
Histogram Steps
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
------------------------------------------------------------
0 792181 0 1
1979 0 230 0 1
1980 0 332 0 1
1981 0 604 0 1
1982 0 622 0 1
1983 0 330 0 1
1984 0 1762 0 1
1985 0 868 0 1
1986 0 551 0 1
1987 0 190 0 1
1988 0 352 0 1
1989 0 519 0 1
1990 0 38829 0 1
1991 0 439486 0 1
1992 0 366357 0 1
1993 0 375469 0 1
1994 0 369176 0 1
1995 0 367691 0 1
1996 0 376979 0 1
1997 0 388239 0 1
1998 0 391408 0 1
1999 0 402551 0 1
2000 0 413392 0 1
2001 0 422470 0 1
2002 0 461895 0 1
2003 0 458726 0 1
2004 0 459876 0 1
2005 0 473357 0 1
2006 0 464213 0 1
2007 0 472373 0 1
2008 0 457623 0 1
2009 0 462268 0 1
2010 0 465633 0 1
2011 0 470338 0 1
2012 0 472091 0 1
2013 0 481586 0 1
2014 0 484236 0 1
2015 0 492460 0 1
2016 0 514569 0 1
2017 0 551739 0 1
2018 0 571969 0 1
2019 0 552550 0 1
2020 0 54 0 1
2021 0 33 0 1
2022 0 21 0 1
2023 0 8 0 1
2025 1 1 1 1
2099 0 1 0 1

What you describe is a bit strange, but in any case a covering index may help, possibly drastically.
Please try creating this index:
CREATE INDEX IX_CampDiv_CD_Camp_CD_CampYR ON dbo.CAMPDIV (cd_camp, CD_CAMPYR )
INCLUDE (cd_key)
At the very least it should avoid the Nested Loops join, which improves the plan.
Please share the results.


how to do sum with multiple joins in PostgreSQL?

I know my question may be a duplicate, but I really don't know how to write SQL that returns sums across multiple joins.
Tables I have
result_summary
num_bin id_summary count_bin
3 172 0
4 172 0
5 172 0
6 172 0
7 172 0
8 172 0
1 174 1
2 174 0
3 174 0
4 174 0
5 174 0
6 174 0
7 174 0
8 174 0
1 175 0
summary_assembly
num_lot id_machine sabun date_work date_write id_product shift count_total count_fail count_good id_summary id_operation
adfe 1 21312 2020-11-25 2020-11-25 1 A 10 2 8 170 2000
adfe 1 21312 2020-11-25 2020-11-25 1 A 1000 1 999 171 2000
adfe 1 21312 2020-11-25 2020-11-25 2 A 100 1 99 172 2000
333 1 21312 2020-12-06 2020-12-06 1 A 10 2 8 500 2000
333 1 21312 2020-11-26 2020-11-26 1 A 10000 1 9999 174 2000
333 1 21312 2020-11-26 2020-11-26 1 A 100 0 100 175 2000
333 1 21312 2020-12-06 2020-12-06 1 A 10 2 8 503 2000
333 1 21312 2020-12-07 2020-12-07 1 A 10 2 8 651 2000
333 1 21312 2020-12-02 2020-12-02 1 A 10 2 8 178 2000
employees
sabun name_emp
3532 Kim
12345 JS
4444 Gilsoo
21312 Wayn Hahn
123 Lee too
333 JD
info_product
id_product name_product
1 typeA
2 typeB
machine
id_machine id_operation name_machine
1 2000 name1
2 2000 name2
3 2000 name3
4 3000 name1
5 3000 name2
6 3000 name3
7 4000 name1
8 4000 name2
query
select S.id_summary, I.name_product, M.name_machine,
E.name_emp, S.sabun, S.date_work,
S.shift, S.num_lot, S.count_total,
S.count_good, S.count_fail,
sum(case num_bin when '1' then count_bin else 0 end) as bin1,
sum(case num_bin when '2' then count_bin else 0 end) as bin2,
sum(case num_bin when '3' then count_bin else 0 end) as bin3,
sum(case num_bin when '4' then count_bin else 0 end) as bin4,
sum(case num_bin when '5' then count_bin else 0 end) as bin5,
sum(case num_bin when '6' then count_bin else 0 end) as bin6,
sum(case num_bin when '7' then count_bin else 0 end) as bin7,
sum(case num_bin when '8' then count_bin else 0 end) as bin8
from result_assembly as R
join summary_assembly as S on R.id_summary = S.id_summary
join employees as E on S.sabun = E.sabun
join info_product as I on S.id_product = I.id_product
join machine as M on S.id_machine = M.id_machine
where I.id_product = '1'
and E.sabun='21312'
and S.shift = 'A'
and S.date_work between '2020-11-10' and '2020-12-20'
group by S.id_summary, E.name_emp, S.num_lot,
I.name_product,M.name_machine
order by S.id_summary;
result
id_summary name_product name_machine name_emp sabun date_work shift num_lot count_total count_good count_fail bin1 bin2 bin3 bin4 bin5 bin6 bin7 bin8
170 TypeA name1 Kim 21312 2020-11-25 A adfe 10 8 2 1 1 0 0 0 0 0 0
171 TypeA name1 Kim 21312 2020-11-25 A adfe 1000 999 1 1 1 0 0 0 0 0 0
174 TypeA name1 Kim 21312 2020-11-26 A 333 10000 9999 1 1 1 0 0 0 0 0 0
175 TypeA name1 Kim 21312 2020-11-26 A 333 100 100 0 0 0 0 0 0 0 0 0
178 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
179 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
180 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
181 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
182 TypeA name2 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
186 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 1 1 0 0 0 0 0 0
193 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
194 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
195 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
196 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
197 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
198 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
199 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
200 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
expected output (when summing by num_lot)
num_lot count_total count_good count_fail bin1 bin2 bin3 bin4 bin5 bin6 bin7 bin8
adfe 323 300 23 22 1 0 0 0 0 0 0
333 4312 4300 12 10 2 0 0 0 0 0 0
All of these were translated from the non-English originals, so there may be typos.
What I need now is to sum by num_lot, name_product, or sabun.
id_summary is unique.
Thanks
As discussed in the comments: it seems you simply need to wrap your query in a subquery and group its result by the column num_lot:
SELECT
num_lot,
SUM(count_total),
SUM(count_good)
-- some more SUM()
FROM (
--<your query>
) s
GROUP BY num_lot
It was asked in the comments what the s stands for: a subquery needs an alias, i.e. an identifier. Because I didn't want to think of a better name, I just called the subselect s. It is shorthand for AS s.
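To try the wrapping pattern on a small scale, here is a minimal sketch. It is not the asker's setup: it uses Python's built-in sqlite3 module instead of PostgreSQL, and a hypothetical table per_summary that stands in for the output of the inner query. The four rows are copied from the result shown in the question (only bin1 is carried, to keep it short).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# per_summary is a hypothetical stand-in for the output of the inner query:
# one row per id_summary, already carrying the per-bin sums.
conn.execute(
    "CREATE TABLE per_summary ("
    "id_summary INTEGER, num_lot TEXT, count_total INTEGER, "
    "count_good INTEGER, count_fail INTEGER, bin1 INTEGER)"
)
conn.executemany(
    "INSERT INTO per_summary VALUES (?, ?, ?, ?, ?, ?)",
    [
        (170, "adfe", 10, 8, 2, 1),
        (171, "adfe", 1000, 999, 1, 1),
        (174, "333", 10000, 9999, 1, 1),
        (175, "333", 100, 100, 0, 0),
    ],
)

# The outer query only re-aggregates what the subquery returns.
rows = conn.execute(
    """
    SELECT num_lot,
           SUM(count_total) AS count_total,
           SUM(count_good)  AS count_good,
           SUM(count_fail)  AS count_fail,
           SUM(bin1)        AS bin1
    FROM (
        SELECT * FROM per_summary   -- the original query would go here
    ) AS s
    GROUP BY num_lot
    ORDER BY num_lot
    """
).fetchall()

for row in rows:
    print(row)
```

Grouping by name_product or sabun instead would only require the inner query to expose that column and the outer GROUP BY to name it.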
It sounds like you want to use crosstab() -- https://www.postgresql.org/docs/current/tablefunc.html

SQL: Increment a row when value in another row changes

I have the following table:
Sequence Change
100 0
101 0
103 0
106 0
107 1
110 0
112 1
114 0
115 0
121 0
126 1
127 0
134 0
I need an additional column, Group, whose values increment each time a 1 occurs in Change. How is that done? I'm using Microsoft SQL Server 2012.
Sequence Change Group
100 0 0
101 0 0
103 0 0
106 0 0
107 1 1
110 0 1
112 1 2
114 0 2
115 0 2
121 0 2
126 1 3
127 0 3
134 0 3
You want a cumulative sum:
select t.*, sum(change) over (order by sequence) as grp
from t;
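A quick way to check the cumulative sum against the table in the question is sketched below. It is an illustration only: it runs the same window expression through Python's built-in sqlite3 module rather than SQL Server, and assumes the bundled SQLite is 3.25 or newer (the first version with window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sequence INTEGER, change INTEGER)")

# Rows copied from the question.
data = [
    (100, 0), (101, 0), (103, 0), (106, 0), (107, 1), (110, 0), (112, 1),
    (114, 0), (115, 0), (121, 0), (126, 1), (127, 0), (134, 0),
]
conn.executemany("INSERT INTO t VALUES (?, ?)", data)

# Running total of change, ordered by sequence: every 1 bumps the group.
rows = conn.execute(
    """
    SELECT sequence, change,
           SUM(change) OVER (ORDER BY sequence) AS grp
    FROM t
    ORDER BY sequence
    """
).fetchall()

groups = [grp for _, _, grp in rows]
print(groups)
```

The running total works because Change is only ever 0 or 1, so each 1 raises the sum by exactly one for that row and every row after it.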

proc sql statement to sum on values/rows that match a condition

I have a data table like below:
Table 1:
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT
10 111 2009 . 100 .
110 120 2009 9 10 .
231 120 2009 0 20 10
222 120 2010 0 40 20
221 222 2009 102 10 30
321 222 2009 0 30 20
213 222 2009 0 10 20
432 321 2009 99 10 0
211 432 2009 111 20 10
212 432 2009 0 20 0
I want to sum over the DAYSBETVISIT column only when the pidDifference value is 0 for each PERSONID. So I wrote the following proc sql statement.
proc sql;
create table table5 as
(
select rowid, YEAR, PERSONID, pidDifference, TIMETOEVENT, DAYSBETVISIT,
SUM(CASE WHEN PIDDifference = 0 THEN DaysBetVisit ELSE 0 END)
from WORK.Table4_1
group by PERSONID,TIMETOEVENT, YEAR
);
quit;
However, the result I got was not summing the DAYSBETVISIT values in rows where PIDDifference = 0 within the same PERSONID. It just output the same value as was present in DAYSBETVISIT in that specific row.
The column that I NEED (sumdays) but don't get with the above statement (the column the statement actually produces is shown as OUT):
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT sumdays OUT
10 111 2009 . 100 . 0 0
110 120 2009 9 10 . 0 0
231 120 2009 0 20 10 30 10
222 120 2010 0 40 20 30 20
221 222 2009 102 10 30 0 0
321 222 2009 0 30 20 40 20
213 222 2009 0 10 20 40 20
432 321 2009 99 10 0 0 0
211 432 2009 111 20 10 0 0
212 432 2009 0 20 0 0 0
I do not know what I am doing wrong.
I am using SAS EG Version 7.15, Base SAS version 9.4.
For your example data it looks like you just need to use two CASE statements. One to define which values to SUM() and another to define whether to report the SUM or not.
proc sql ;
select personid, piddifference, daysbetvisit, sumdays
, case when piddifference = 0
then sum(case when piddifference=0 then daysbetvisit else 0 end)
else 0 end as WANT
from expect
group by personid
;
quit;
Results
PERSONID pidDifference DAYSBETVISIT sumdays WANT
--------------------------------------------------------
111 . . 0 0
120 0 10 30 30
120 0 20 30 30
120 9 . 0 0
222 0 20 40 40
222 0 20 40 40
222 102 30 0 0
321 99 0 0 0
432 0 0 0 0
432 111 10 0 0
SAS proc sql doesn't support window functions. I find the re-merging aggregations to be a bit difficult to use, except in the obvious cases. So, use a subquery or join and group by:
proc sql;
create table table5 as
select t.rowid, t.YEAR, t.PERSONID, t.pidDifference, t.TIMETOEVENT, t.DAYSBETVISIT,
tt.sum_DaysBetVisit
from WORK.Table4_1 t left join
(select personid, sum(DaysBetVisit) as sum_DaysBetVisit
from WORK.Table4_1
group by personid
having min(pidDifference) = max(pidDifference) and min(pidDifference) = 0
) tt
on tt.personid = t.personid;
quit;
Note: This doesn't handle NULL values for pidDifference. If that is a concern, you can add count(pidDifference) = count(*) to the having clause.
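For comparison, here is a sketch of the join-to-a-subquery idea that reproduces the sumdays column from the question. It is a variant, not the query above: the subquery filters with WHERE pidDifference = 0 instead of using HAVING (the HAVING version only keeps persons whose non-missing pidDifference values are all 0). It also runs in SQLite through Python's sqlite3 module rather than in SAS, and the table name visits and column name row_id are placeholders chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# visits and row_id are hypothetical names; SAS missing values (.) become NULL.
conn.execute(
    "CREATE TABLE visits (row_id INTEGER, personid INTEGER, year INTEGER, "
    "piddifference INTEGER, timetoevent INTEGER, daysbetvisit INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?, ?)",
    [
        (10, 111, 2009, None, 100, None),
        (110, 120, 2009, 9, 10, None),
        (231, 120, 2009, 0, 20, 10),
        (222, 120, 2010, 0, 40, 20),
        (221, 222, 2009, 102, 10, 30),
        (321, 222, 2009, 0, 30, 20),
        (213, 222, 2009, 0, 10, 20),
        (432, 321, 2009, 99, 10, 0),
        (211, 432, 2009, 111, 20, 10),
        (212, 432, 2009, 0, 20, 0),
    ],
)

# Per-person sum over the pidDifference = 0 rows, joined back to every row;
# the outer CASE reports it only on rows where pidDifference is 0.
rows = conn.execute(
    """
    SELECT t.row_id,
           CASE WHEN t.piddifference = 0
                THEN COALESCE(tt.sum_days, 0)
                ELSE 0
           END AS sumdays
    FROM visits AS t
    LEFT JOIN (
        SELECT personid, SUM(daysbetvisit) AS sum_days
        FROM visits
        WHERE piddifference = 0
        GROUP BY personid
    ) AS tt ON tt.personid = t.personid
    """
).fetchall()

sumdays = dict(rows)
print(sumdays)
```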

Postgres query for annual sales report by rep. grouped by month

I would like to create an annual sales report table by sales rep, showing all twelve months. The data I have is more or less like in this example:
id | rep | date | price
----------------------------
1 1 2017-01-01 20
2 1 2017-01-20 44
3 2 2017-02-18 13
4 2 2017-03-08 12
5 2 2017-04-01 88
6 2 2017-09-05 67
7 3 2017-01-31 10
8 3 2017-06-01 74
The result I need would be like this:
Rep Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
----------------------------------------------------
1 64 0 0 0 0 0 0 0 0 0 0 0
2 0 13 12 88 0 0 0 0 67 0 0 0
3 10 0 0 0 0 74 0 0 0 0 0 0
What would be the most efficient way to write this query?
One way:
select rep,
sum(case when extract('month' from date) = 1 then price else 0 end ) as Jan,
sum(case when extract('month' from date) = 2 then price else 0 end ) as Feb
-- other months here
from t
group by rep
One way is to use windowed functions with FILTER:
SELECT DISTINCT
"rep",
COALESCE(SUM(price) FILTER (WHERE extract('month' from "date") = 1) OVER(PARTITION BY "rep"),0) AS Jan,
COALESCE(SUM(price) FILTER (WHERE extract('month' from "date") = 2) OVER(PARTITION BY "rep"),0) AS Feb
--....
FROM ta;
Warning!
You probably want to partition by YEAR too to avoid summing JAN 2017 and JAN 2018.
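Putting the conditional-aggregation approach and the year warning together, a sketch could look like the following. It is an approximation of the PostgreSQL queries: it runs on SQLite via Python's sqlite3 module, so strftime replaces extract, and the table name sales and column name sale_date are placeholders. The rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sales / sale_date are hypothetical names for the question's table and date column.
conn.execute(
    "CREATE TABLE sales (id INTEGER, rep INTEGER, sale_date TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2017-01-01", 20), (2, 1, "2017-01-20", 44),
        (3, 2, "2017-02-18", 13), (4, 2, "2017-03-08", 12),
        (5, 2, "2017-04-01", 88), (6, 2, "2017-09-05", 67),
        (7, 3, "2017-01-31", 10), (8, 3, "2017-06-01", 74),
    ],
)

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# One SUM(CASE ...) column per month, generated instead of typed out.
month_cols = ",\n       ".join(
    f"SUM(CASE WHEN strftime('%m', sale_date) = '{i:02d}' "
    f"THEN price ELSE 0 END) AS {name}"
    for i, name in enumerate(months, start=1)
)

# Grouping by year as well keeps Jan 2017 and Jan 2018 in separate rows.
sql = f"""
SELECT rep,
       strftime('%Y', sale_date) AS yr,
       {month_cols}
FROM sales
GROUP BY rep, yr
ORDER BY rep, yr
"""
rows = conn.execute(sql).fetchall()

for row in rows:
    print(row)
```

With only 2017 data there is one row per rep; a second year would simply add another row for each rep that has sales in it.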

count number of records every month on group by clause in sql

I need to run a report every month where I have town, street name, t_number, and t_date_time_issued, where t_number is the ticket issued in each town. I need to know how many tickets were issued in each month of each year, and the difference between the total tickets issued in the same month last year and the total issued in this month of the present year. I tried the following, but with no luck.
SELECT [Town],
[Site Name],
SUM(CASE datepart(month,t_date_time_issued) WHEN 1 THEN 1 ELSE 0 END) AS 'January',
SUM(CASE datepart(month,t_date_time_issued) WHEN 2 THEN 1 ELSE 0 END) AS 'February',
SUM(CASE datepart(month,t_date_time_issued) WHEN 3 THEN 1 ELSE 0 END) AS 'March',
SUM(CASE datepart(month,t_date_time_issued) WHEN 4 THEN 1 ELSE 0 END) AS 'April',
SUM(CASE datepart(month,t_date_time_issued) WHEN 5 THEN 1 ELSE 0 END) AS 'May',
SUM(CASE datepart(month,t_date_time_issued) WHEN 6 THEN 1 ELSE 0 END) AS 'June',
SUM(CASE datepart(month,t_date_time_issued) WHEN 7 THEN 1 ELSE 0 END) AS 'July',
SUM(CASE datepart(month,t_date_time_issued) WHEN 8 THEN 1 ELSE 0 END) AS 'August',
SUM(CASE datepart(month,t_date_time_issued) WHEN 9 THEN 1 ELSE 0 END) AS 'September',
SUM(CASE datepart(month,t_date_time_issued) WHEN 10 THEN 1 ELSE 0 END) AS 'October',
SUM(CASE datepart(month,t_date_time_issued) WHEN 11 THEN 1 ELSE 0 END) AS 'November',
SUM(CASE datepart(month,t_date_time_issued) WHEN 12 THEN 1 ELSE 0 END) AS 'December',
SUM(CASE datepart(year,t_date_time_issued) WHEN 2012 THEN 1 ELSE 0 END) AS 'TOTAL'
FROM
[VCS].[dbo].[manual issued sites] as a
inner join icps.dbo.tickets as b on
b.[t_street_name]COLLATE DATABASE_DEFAULT = a.[Town] COLLATE DATABASE_DEFAULT and
b.[t_zone_name]COLLATE DATABASE_DEFAULT = a.[Site Name] COLLATE DATABASE_DEFAULT
where year(t_date_time_issued) = '2012'
GROUP BY month(t_date_time_issued),town,sitename
order by town
But I need something like:
town sitename jan feb march april june july..... total
abc bbb 10 10 15 25 30 45 xyz..
ybb jjjj 25 14 45 25 312 455 ....uuu
----------
The query above returned this:
Town Site Name January February March April May June July August September October November December TOTAL
Bawtry Market Hill 155 0 0 0 0 0 0 0 0 0 0 0 155
Bawtry Market Hill 0 194 0 0 0 0 0 0 0 0 0 0 194
Bawtry Market Hill 0 0 144 0 0 0 0 0 0 0 0 0 144
Bawtry Market Hill 0 0 0 114 0 0 0 0 0 0 0 0 114
Formby The Cloisters 0 0 0 0 0 0 0 0 0 0 0 36 36
Kidderminster Crossley Retail Park 27 0 0 0 0 0 0 0 0 0 0 0 27
Kidderminster Crossley Retail Park 0 15 0 0 0 0 0 0 0 0 0 0 15
Kidderminster Crossley Retail Park 0 0 20 0 0 0 0 0 0 0 0 0 20