Query on large table - sql

I have the following query -
SELECT n.fname
,ii.render AS practitioner_npi
,n.address AS address1
,substring(n.postal,0,6) AS zip
,substring(n.postal,6,4) AS zip4
,ii.count AS count
FROM
(
SELECT render, count(*) AS count
FROM dx sl
JOIN annual caq
ON DATE_TRUNC('quarter', date_of_service::date) >= caq.start
JOIN entities n
ON sl.render = n.npi
WHERE dx_cd IN (
SELECT DISTINCT dx_cd
FROM dx_per_code pc
JOIN bucket bac
ON pc.code = bac.hcpccode
WHERE
bucketname = 'something'
AND dx_rank BETWEEN 1 AND 5
)
AND n.npi_type = '1'
GROUP BY render
)
ii
JOIN npi n ON n.npi = ii.render
LEFT JOIN taxonomy t ON t.code = n.taxonomy
ORDER BY ii.count DESC;
The dx table does not have any indexes and contains approax 8B records. This query currently takes 20 minutes to run. What indexes/optimizations can I make to get this to run faster?

Related

How to get Odoo Inventory adjustment value through SQL

I am working on a custom stock valuation module and in one model I am trying to get adjustment value for a lot - product - warehouse wise of the previous day.
QUERY 1
SELECT COUNT(*)
FROM
(
SELECT stock_inventory.date AS stock_adjustment_date,
stock_move_line.lot_id,
stock_move_line.product_id,
SUM(stock_move_line.qty_done) total_stock_adjustment
FROM stock_move_line
LEFT JOIN stock_move ON stock_move_line.move_id = stock_move.id
LEFT JOIN stock_inventory ON stock_move.inventory_id = stock_inventory.id
WHERE stock_move.inventory_id IS NOT NULL
AND stock_move_line.location_id = 5
AND stock_move_line.location_dest_id = 13
AND stock_move_line.lot_id IS NOT NULL
GROUP BY stock_move_line.lot_id, stock_move_line.product_id, stock_inventory.date
ORDER BY total_stock_adjustment DESC
)
testTable;
QUERY 2
SELECT COUNT(*)
FROM
(
SELECT stock_inventory.date AS stock_adjustment_date,
stock_move_line.lot_id,
stock_move_line.product_id,
SUM(stock_move_line.qty_done) total_stock_adjustment
FROM stock_move_line
LEFT JOIN stock_move ON stock_move_line.move_id = stock_move.id
LEFT JOIN stock_inventory ON stock_move.inventory_id = stock_inventory.id
WHERE stock_move.inventory_id IS NOT NULL
AND stock_move_line.location_id = 13
AND stock_move_line.location_dest_id = 5
AND stock_move_line.lot_id IS NOT NULL
GROUP BY stock_move_line.lot_id, stock_move_line.product_id, stock_inventory.date
ORDER BY total_stock_adjustment DESC
)
testTable;
Why these both queries returning same count 14,849 ?
13 is the warehouse ID and 5 is the virtual location used for adjustment. What I am doing wrong here?

altering query in db2 to fix count from a join

I'm getting an aggregated count of records for orders and I'm getting the expected count on this basic query:
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)||'-'||substr(g.EXTD1d,7,2) ) between current_Date - 180 DAY AND current_Date
But as soon as I add back in my joins and joined values then my count goes from 1 (which it should be) to over 200. All I need from these joins is the customer ID and the manager number. so even if my count is high, I'm basically just trying to say "for this cstnoc, give me the slsupr and xlsno"
How can I perform this below query without affecting the count? I only want my count (sales_180 and velocity) coming from the custgroup table based on my where clause, but I then just want one value of the xcstno and xslsno based on the cstnoc.
SELECT
count(*) as sales_180,
180/count(*) as velocity,
c.xslsno as CustID,
cr.slsupr as Manager
FROM custgroup g
inner join customers c
on g.cstnoc = c.xcstno
inner join managers cr
on c.xslsno = cr.xslsno
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)||'-'||substr(g.EXTD1d,7,2) ) between current_Date - 180 DAY AND current_Date
GROUP BY c.xslsno, cr.slsupr
You are producing multiple rows when joining, so your count is now counting all the resulting rows with all that [unintended] multiplicity.
The solution? Use a table expression to pre-compute your count, and then you can join it to the other tables, as in:
select
g2.sales_180,
g2.velocity,
c.xslsno as CustID,
cr.slsupr as Manager
from customers c
join managers cr on c.xslsno = cr.xslsno
join ( -- here the Table Expression starts
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)
||'-'||substr(g.EXTD1d,7,2) )
between current_Date - 180 DAY AND current_Date
) g2 on g2.cstnoc = c.xcstno
You can also use a Common Table Expression (CTE) that will produce the same result:
with g2 as (
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)
||'-'||substr(g.EXTD1d,7,2) )
between current_Date - 180 DAY AND current_Date
)
select
g2.sales_180,
g2.velocity,
c.xslsno as CustID,
cr.slsupr as Manager
from customers c
join managers cr on c.xslsno = cr.xslsno
join g2 on g2.cstnoc = c.xcstno

How to use alias of a subquery to get the running total?

I have a UNION of 3 tables for calculating some balance and I need to get the running SUM of that balance but I can't use PARTITION OVER, because I must do it with a sql query that can work in Access.
My problem is that I cannot use JOIN on an alias subquery, it won't work.
How can I use alias in a JOIN to get the running total?
Or any other way to get the SUM that is not with PARTITION OVER, because it does not exist in Access.
This is my code so far:
SELECT korisnik_id, imePrezime, datum, Dug, Pot, (Dug - Pot) AS Balance
FROM (
SELECT korisnik_id, k.imePrezime, r.datum, SUM(IIF(u.jedinstven = 1, r.cena, k.kvadratura * r.cena)) AS Dug, '0' AS Pot
FROM Racun r
INNER JOIN Usluge u ON r.usluga_id = u.ID
INNER JOIN Korisnik k ON r.korisnik_id = k.ID
WHERE korisnik_id = 1
AND r.zgrada_id = 1
AND r.mesec = 1
AND r.godina = 2017
GROUP BY korisnik_id, k.imePrezime, r.datum
UNION ALL
SELECT korisnik_id, k.imePrezime, rp.datum, SUM(IIF(u.jedinstven = 1, rp.cena, k.kvadratura * rp.cena)) AS Dug, '0' AS Pot
FROM RacunP rp
INNER JOIN Usluge u ON rp.usluga_id = u.ID
INNER JOIN Korisnik k ON rp.korisnik_id = k.ID
WHERE korisnik_id = 1
AND rp.zgrada_id = 1
AND rp.mesec = 1
AND rp.godina = 2017
GROUP BY korisnik_id, k.imePrezime, rp.datum
UNION ALL
SELECT uu.korisnik_id, k.imePrezime, uu.datum, '0' AS Dug, SUM(uu.iznos) AS Pot
FROM UnosUplata uu
INNER JOIN Korisnik k ON uu.korisnik_id = k.ID
WHERE korisnik_id = 1
GROUP BY uu.korisnik_id, k.imePrezime, uu.datum
) AS a
ORDER BY korisnik_id
You can save a query (let's name it Query1) for the UNION of the 3 tables and then create another query that returns each row in the first query and calculates the sum of the rows that are before it (optionally checking that they are in the same group).
It should be something like this:
SELECT *, (
SELECT SUM(Value) FROM Query1 AS b
WHERE b.GroupNumber=a.GroupNumber
AND b.Position<=a.Position
) AS RunningSum
FROM Query1 AS a
However, it's more efficient to do that in the report.

Get count from table given input from another

I have 3 tables: filer_info, filer_persent and persent_email, connected via:
filer_info.filer_info_id = filer_persent.filer_info_id
filer_persent.persent_info_id = persent_email.persent_info_id
I want to find all rows where I have multiple type PRIMARY in the persent_email table (ie count > 1). And the only thing I want to return in the query is filer_info.filer_ident and the count.
This gives me every row, but I only want the data where filer_ident > 1 in the returned rows.
select * from filer_info f
inner join filer_persent fp on f.filer_info_id=fp.filer_info_id
inner join persent_email p on fp.persent_info_id=p.persent_info_id
where fp.filer_persent_kind_cd = 'FILER' and p.persent_email_kind_cd='PRIMARY'
order by f.filer_ident
SELECT filer_ident, cnt
FROM (
SELECT filer_info_id, COUNT(*) cnt
FROM filer_persent fp
JOIN persent_email p
USING (persent_info_id)
WHERE (fp.filer_persent_kind_cd, p.persent_email_kind_cd) = ('FILER', 'PRIMARY')
HAVING COUNT(*) > 1
) с
JOIN filer_info f
USING (filer_info_id)
WHERE filer_ident > 1
ORDER BY
f.filer_ident

Rollup / recursive addition SQL Server 2008

I have a query with rollup that outputs data like (the query is a little busy, but I can post if necessary)
range subCounts Counts percent
1-9 3 100 3.0
10-19 13 100 13.0
20-29 30 100 33.0
30-39 74 100 74.0
NULL 100 100 100.0
How is it possible to keep a running summation total of percent? Say I need to find the bottom 15 percentile, in this case 3+13=16 so I would like for the last row to be returned read
range subCounts counts percent
10-19 13 100 13.0
EDIT1: here the query
select '$'+cast(+bin*10000 + ' ' as varchar(10)) + '-' + cast(bin*10000+9999 as varchar(10)) as bins,
count(*) as numbers,
(select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')
) as Totals
, round(100*count(*)/cast((select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')) as float),2) as binsPercent
from
(
select tblclaims.patientid, sum(claimsmedicarepaid) as TotalCosts,
cast(sum(claimsmedicarePaid)/10000 as int) as bin
from tblclaims inner join patient on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on patient.hospitalnpi = tblhospitals.hospitalnpi
where tblhospitals.hospitalname = 'X'
group by tblclaims.patientid
) as t
group by bin with rollup
OK, so for whomever might use this for reference I figured out what I needed to do.
I added row_number() over(bin) as rownum to the query and saved all of this as a view.
Then I used
SELECT *,
SUM(t2.binspercent) AS SUM
FROM t t1
INNER JOIN t t2 ON t1.rownum >= t2.rownum
GROUP BY t1.rownum,
t1.bins, t1.numbers, t1.uktotal, t1.binspercent
ORDER BY t1.rownum
by joining t1.rownum >=t2.rownum you can get the rolling count sort of thing.
This isn't exactly what i was looking for, but it's on the same track:
http://blog.tallan.com/2011/12/08/sql-server-2012-windowing-functions-part-1-of-2-running-and-sliding-aggregates/ and http://blog.tallan.com/2011/12/19/sql-server-2012-windowing-functions-part-2-of-2-new-analytic-functions/ - check out PERCENT_RANK
CUME_DIST
PERCENTILE_CONT
PERCENTILE_DISC
Sorry for the lame answer