I have a SQL query that needs to convert a string column to an array, and I have to filter on that column. The SQL looks like this:
select
parent_line,
string_to_array(parent_line, '-')
from
bx_crm.department
where
status = 0 and
'851' = ANY(string_to_array(parent_line, '-')) and
array_length(string_to_array(parent_line, '-'), 1) = 5;
parent_line is a varchar(50) column; the data in it looks like 0-1-851-88.
Questions:
string_to_array(parent_line, '-') appears several times in my SQL.
How many times is string_to_array(parent_line, '-') calculated for each row: one time or three times?
How can I turn string_to_array(parent_line, '-') into a parameter? In the end, my SQL might look like this:
depts = string_to_array(parent_line, '-')
select
parent_line,
depts
from
bx_crm.department
where
status = 0 and
'851' = ANY(depts) and
array_length(depts, 1) = 5;
Postgres supports lateral joins, which can simplify this logic:
select parent_line, v.parents, status, ... other columns ...
from bx_crm.department d cross join lateral
     (values (string_to_array(parent_line, '-'))) v(parents)
where d.status = 0 and
      cardinality(v.parents) = 5 and
      '851' = any(v.parents);
Use a derived table:
select *
from (
select parent_line,
string_to_array(parent_line, '-') as parents,
status,
... other columns ...
from bx_crm.department
) x
where status = 0
and cardinality(parents) = 5
and '851' = any(parents)
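As for the first question (how many times the function runs): you can check the plan yourself. EXPLAIN (VERBOSE) prints the expressions the executor evaluates, and PostgreSQL does not share the result of identical expressions, so each occurrence of string_to_array is evaluated on its own. A minimal sketch against your original query (run it on the rewrites above too, to compare):
explain (verbose)
select parent_line,
       string_to_array(parent_line, '-')
from bx_crm.department
where status = 0
  and '851' = any(string_to_array(parent_line, '-'))
  and array_length(string_to_array(parent_line, '-'), 1) = 5;
-- The Output and Filter lines of the plan show string_to_array(...) once per occurrence.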
Related
I have two tables, headers and lines. I need to grab the batch_submission_date from the headers table, but sometimes a query for a batch_id returns null for batch_submission_date while also returning a parent_batch_id; if we then query THAT parent_batch_id as a batch_id, it returns the correct batch_submission_date.
e.g.
SELECT t1.batch_id,
t1.parent_batch_id,
t2.batch_submission_date
FROM db.headers t1, db.lines t2
WHERE t1.batch_id = '12345';
output = 12345, 99999, null
Then we use that parent_batch_id as a batch_id:
SELECT t1.batch_id,
t1.parent_batch_id,
t2.batch_submission_date
FROM db.headers t1, db.lines t2
WHERE t1.batch_id = '99999';
and we get output = 99999,99999,'2018-01-01'
So I'm trying to write a query that will do this for me - anytime a batch_id's batch_submission_date is null, we find that batch_id's parent batch_id and query that instead.
This was my idea - but I just get back null both for bp_batch_submission_date and for new_submission_date.
SELECT
t1.parent_id as parent_id,
t1.BATCH_ID as bp_batch_id,
t2.BATCH_LINE_NUMBER as bp_batch_li,
t1.BATCH_SUBMISSION_DATE as bp_batch_submission_date,
CASE
WHEN t1.BATCH_SUBMISSION_DATE is null
THEN
(SELECT a.BATCH_SUBMISSION_DATE
FROM
db.headers a,
db.lines b
WHERE
a.SD_BATCH_HEADERS_SKEY = b.SD_BATCH_HEADERS_SKEY
and a.parent_batch_id = bp_batch_id
and b.batch_line_number = bp_batch_li
) END as new_submission_date
FROM
db.headers t1,
db.lines t2
WHERE
t1.SD_BATCH_HEADERS_SKEY = t2.SD_BATCH_HEADERS_SKEY
and (t1.BATCH_ID = '12345' or t1.PARENT_BATCH_ID = '12345')
and t2.BATCH_LINE_NUMBER = '1'
GROUP BY
t2.BATCH_CLAIM_LINE_STATUS_DESC,
t1.PARENT_BATCH_ID,
t1.BATCH_ID,
t2.BATCH_LINE_NUMBER,
t1.BATCH_SUBMISSION_DATE;
Is what I'm trying to do possible, using the bp_batch_id and bp_batch_li variables?
Use a CTE (common table expression) to avoid redundant code, then use coalesce() to find the parent's date in case of null. In your first queries you didn't include a join condition between the two tables; I assumed it's based on sd_batch_headers_skey, as in your last query.
with t as (
select h.batch_id, h.parent_batch_id, l.batch_submission_date bs_date
from headers h
join lines l on l.sd_batch_headers_skey = h.sd_batch_headers_skey
and l.batch_line_number = '1' )
select batch_id, parent_batch_id,
coalesce(bs_date, (select bs_date from t x where x.batch_id = t.parent_batch_id)) bs_date
from t
where batch_id = 12345;
You could use simpler syntax with CONNECT BY and LEVEL <= 2, but if your data really contains rows with the same ids (99999, 99999), you would get a cycle error.
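For reference, a minimal sketch of that CONNECT BY variant (Oracle syntax; FETCH FIRST needs 12c+), reusing the same t CTE; NOCYCLE is added here to sidestep the self-referencing rows mentioned above:
with t as (
  select h.batch_id, h.parent_batch_id, l.batch_submission_date bs_date
  from headers h
  join lines l on l.sd_batch_headers_skey = h.sd_batch_headers_skey
              and l.batch_line_number = '1')
select batch_id, parent_batch_id, bs_date
from t
where bs_date is not null
start with batch_id = '12345'
connect by nocycle prior parent_batch_id = batch_id and level <= 2
order by level
fetch first 1 row only;  -- the nearest non-null date wins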
I am trying to figure out if there's a way to combine these 2 queries into a single one. I've run into the limits of what I know and can't figure out if this is possible or not.
This is the 1st query, which gets last year's sales for each day per location (for one month):
if object_id('tempdb..#LY_Data') is not null drop table #LY_Data
select
[LocationId] = ri.LocationId,
[LY_Date] = convert(date, ri.ReceiptDate),
[LY_Trans] = count(distinct ri.SalesReceiptId),
[LY_SoldQty] = convert(money, sum(ri.Qty)),
[LY_RetailAmount] = convert(money, sum(ri.ExtendedPrice)),
[LY_NetSalesAmount] = convert(money, sum(ri.ExtendedAmount))
into #LY_Data
from rpt.SalesReceiptItem ri
join #Location l
on ri.LocationId = l.Id
where ri.Ignored = 0
and ri.LineType = 1 /*Item*/
and ri.ReceiptDate between @_LYDateFrom and @_LYDateTo
group by
ri.LocationId,
ri.ReceiptDate
Then the 2nd query computes a ratio based on the total sales for that month for each day (to be used later):
if object_id('tempdb..#LY_Data2') is not null drop table #LY_Data2
select
[LocationId] = ly.LocationId,
[LY_Date] = ly.LY_Date,
[LY_Trans] = ly.LY_Trans,
[LY_RetailAmount] = ly.LY_RetailAmount,
[LY_NetSalesAmount] = ly.LY_NetSalesAmount,
[Ratio] = ly.LY_NetSalesAmount / t.MonthlySales
into #LY_Data2
from (
select
[LocationId] = ly.LocationId,
[MonthlySales] = sum(ly.LY_NetSalesAmount)
from #LY_Data ly
group by
ly.LocationId
) t
join #LY_Data ly
on t.LocationId = ly.LocationId
I've tried using the first query as a subquery in the 2nd query's grouped FROM clause, but that won't let me select those columns in the outermost SELECT statement (the multi-part identifier couldn't be bound).
I've also tried putting the first query into the JOIN clause at the end of the 2nd query, with the same issue.
There's probably something I'm missing, but I'm still pretty new to SQL, so any help or just a pointer in the right direction would be greatly appreciated! :)
You can try using a Common Table Expression (CTE) and window function:
if object_id('tempdb..#LY_Data') is not null drop table #LY_Data
;with
cte AS
(
select
[LocationId] = ri.LocationId,
[LY_Date] = convert(date, ri.ReceiptDate),
[LY_Trans] = count(distinct ri.SalesReceiptId),
[LY_SoldQty] = convert(money, sum(ri.Qty)),
[LY_RetailAmount] = convert(money, sum(ri.ExtendedPrice)),
[LY_NetSalesAmount] = convert(money, sum(ri.ExtendedAmount))
from rpt.SalesReceiptItem ri
join #Location l
on ri.LocationId = l.Id
where ri.Ignored = 0
and ri.LineType = 1 /*Item*/
and ri.ReceiptDate between @_LYDateFrom and @_LYDateTo
group by
ri.LocationId,
ri.ReceiptDate
)
select
[LocationId] = cte.LocationId,
[LY_Date] = cte.LY_Date,
...
[Ratio] = cte.LY_NetSalesAmount / sum(cte.LY_NetSalesAmount) over (partition by cte.LocationId)
into #LY_Data
from cte
sum(cte.LY_NetSalesAmount) over (partition by cte.LocationId) gives you the sum for each LocationId. The code assumes that this sum is always non-zero; otherwise, a divide-by-zero error will occur.
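If that sum could ever be zero, one common guard (a small variation, not strictly needed if the data guarantees non-zero totals) is to wrap the divisor in nullif(), so the ratio comes back as NULL instead of raising an error:
[Ratio] = cte.LY_NetSalesAmount
          / nullif(sum(cte.LY_NetSalesAmount) over (partition by cte.LocationId), 0)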
It seems like all you need to do is calculate the ratio in the first query.
You can do this with a correlated subquery.
SELECT
...
convert(money, sum(ri.ExtendedAmount)/(SELECT sum(ri2.ExtendedAmount)
FROM rpt.SalesReceiptItem ri2
WHERE ri2.LocationId=ri.LocationId
)
) AS ratio --extended amount/total extended amount for this location
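A sketch of what the combined first query might then look like (untested; the subquery repeats the same filters and the @_LYDateFrom/@_LYDateTo range on the assumption that the monthly total should cover the same receipts as the numerator, so drop those predicates if the total should be unfiltered):
if object_id('tempdb..#LY_Data2') is not null drop table #LY_Data2
select
    [LocationId]        = ri.LocationId,
    [LY_Date]           = convert(date, ri.ReceiptDate),
    [LY_Trans]          = count(distinct ri.SalesReceiptId),
    [LY_SoldQty]        = convert(money, sum(ri.Qty)),
    [LY_RetailAmount]   = convert(money, sum(ri.ExtendedPrice)),
    [LY_NetSalesAmount] = convert(money, sum(ri.ExtendedAmount)),
    [Ratio]             = convert(money, sum(ri.ExtendedAmount) /
                              (select sum(ri2.ExtendedAmount)
                               from rpt.SalesReceiptItem ri2
                               where ri2.LocationId = ri.LocationId
                                 and ri2.Ignored = 0
                                 and ri2.LineType = 1
                                 and ri2.ReceiptDate between @_LYDateFrom and @_LYDateTo))
into #LY_Data2
from rpt.SalesReceiptItem ri
join #Location l
  on ri.LocationId = l.Id
where ri.Ignored = 0
  and ri.LineType = 1 /*Item*/
  and ri.ReceiptDate between @_LYDateFrom and @_LYDateTo
group by
    ri.LocationId,
    ri.ReceiptDate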
I have this table
mt.id  mt.otherId  mt.name  mt.myChar  mt.type
1      10          stack    U          "question"
2      10          stack    D
3      30          stack    U          "question"
4      10          stack    U          "whydownvotes"
I want only rows with id 2 and 3 returned (without using id or otherId as a parameter), while ensuring name and type match the given parameters. When there is a duplicate otherId, return the row with the minimum myChar value. So far I have this:
select mt.* from myTable mt
where (mt.myChar = 'U' AND (mt.name = 'stack' AND mt.type LIKE '%question%'))
or (mt.myChar = 'D' and mt.name = 'stack')
So where otherId is 10, I want the row with the minimum myChar value, 'D'. Am I going to need a subquery or a GROUP BY using min(myChar)?
How do I remove the first row in the SQL fiddle (without using the id):
http://sqlfiddle.com/#!9/c579a/1
Edit: Jeepers, what's with the downvotes? It's a clear question, isn't it? There is even a SQL fiddle.
If this is SQL Server, then you can do it in two steps like this:
WITH filtered AS (
SELECT
mt.*,
minType = MIN(mt.type) OVER (PARTITION BY mt.otherId)
FROM
dbo.myTable AS mt
WHERE (mt.myChar = 'U' AND mt.name = 'stack' AND mt.type LIKE '%question%')
OR (mt.myChar = 'D' AND mt.name = 'stack')
)
SELECT
id,
otherId,
name,
myChar,
type
FROM
filtered
WHERE
type = minType
;
The filtered subquery is basically your current query but with an additional column that holds minimum type values per otherId. The main query filters the filtered set further based on the type = minType condition.
I am assuming what you want is a groupwise minimum, one of the most commonly asked SQL questions. You can try the query below, which should work on any DBMS. If you are using SQL Server, you can also use ROW_NUMBER(), which is very easy to use. Here myTable is your table.
SELECT t0.*
FROM myTable AS t0
LEFT JOIN myTable AS t1 ON t0.otherId = t1.otherId AND t1.myChar < t0.myChar
WHERE t1.myChar IS NULL;
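A sketch of that ROW_NUMBER() variant (SQL Server syntax; the ranked CTE and rn column are just illustrative names), keeping the same WHERE clause and picking the smallest myChar per otherId:
WITH ranked AS (
    SELECT mt.*,
           ROW_NUMBER() OVER (PARTITION BY mt.otherId ORDER BY mt.myChar) AS rn
    FROM myTable AS mt
    WHERE (mt.myChar = 'U' AND mt.name = 'stack' AND mt.type LIKE '%question%')
       OR (mt.myChar = 'D' AND mt.name = 'stack')
)
SELECT id, otherId, name, myChar, type
FROM ranked
WHERE rn = 1;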
I have a huge dataset to return in SQL Server (about 32 million rows). It is implemented as a view, and the source code is as follows:
SELECT Idenitifier = ISNULL(mle.MIdeer, mle.Ider) + em.MemberId,
EffectiveDate = ISNULL(em.EffectiveDate,
(SELECT TOP 1 EffectiveDate
FROM c
WHERE SourceType = em.SourceType
AND GroupNumber = em.GroupNumber
AND ISNULL(GroupDivision, '') =
ISNULL(em.GroupDivision, '')))
FROM a em
JOIN b mle
ON mle.Identifier = em.GroupNumber + ISNULL('-' + em.GroupDivision, '')
-- Filter invalid legal entities
AND ISNULL(mle.Filter, 0) = 0
--- Gets a resultset of 531798 rows
CROSS JOIN -- this returns 63 rows , so
-- I am presuming 531798*63 rows here.
(SELECT *
FROM map
WHERE domaintype = 'MC')b;
I need to load this dataset into a table using SSIS. After 16 million rows, I get a System.OutOfMemoryException in SQL Server when I run a select * from <<view>>. How can I load this dataset into the table using SSIS while avoiding this exception?
What other, better methods are there to run this query efficiently, given that it takes more than 30 minutes?
I'm still thinking through this, but you might need to separate the CROSS JOIN:
;WITH cte AS (SELECT ISNULL(mle.MIdeer, mle.Ider) + em.MemberId AS Idenitifier
, ISNULL(em.EffectiveDate,
( SELECT TOP 1
EffectiveDate
FROM c
WHERE SourceType = em.SourceType
AND GroupNumber = em.GroupNumber
AND ISNULL(GroupDivision, '') = ISNULL(em.GroupDivision,
'')
)) AS EffectiveDate
FROM a em
JOIN b mle ON mle.Identifier = em.GroupNumber + ISNULL('-'+ em.GroupDivision,'')
AND ISNULL(mle.Filter, 0) = 0)
SELECT *
FROM cte
CROSS JOIN ( SELECT *
FROM map
WHERE domaintype = 'MC') b;
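If splitting the work helps, another option along the same lines (a sketch, not tested against the real tables; it runs as a standalone script rather than a view, and #base is just an illustrative temp-table name) is to materialize the ~531,798-row intermediate result once and let SSIS read the fan-out from it, so the correlated TOP 1 lookup runs at most once per base row:
-- Materialize the narrow intermediate result once...
SELECT ISNULL(mle.MIdeer, mle.Ider) + em.MemberId AS Idenitifier,
       ISNULL(em.EffectiveDate,
              (SELECT TOP 1 EffectiveDate
               FROM c
               WHERE SourceType = em.SourceType
                 AND GroupNumber = em.GroupNumber
                 AND ISNULL(GroupDivision, '') = ISNULL(em.GroupDivision, ''))) AS EffectiveDate
INTO #base
FROM a em
JOIN b mle
  ON mle.Identifier = em.GroupNumber + ISNULL('-' + em.GroupDivision, '')
 AND ISNULL(mle.Filter, 0) = 0;

-- ...then fan it out against the 63 mapping rows.
SELECT *
FROM #base
CROSS JOIN (SELECT * FROM map WHERE domaintype = 'MC') m;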
Please, observe:
DECLARE @UseFastLane BIT
SELECT TOP 1 @UseFastLane = 1
FROM BackgroundJobService
WHERE IsFastLane = 1;
SELECT TOP 1 bjs.HostName AllocatedAgentHostName,
bjs.ServiceName AllocatedAgentServiceName,
bjs.IsFastLane,
SUM(CASE
WHEN bjw.WorkStatusTypeId IN ( 2, 3, 4, 10 ) THEN 1
ELSE 0
END) AS InProgress
FROM BackgroundJobService bjs
LEFT JOIN BackgroundJobWork bjw
ON bjw.AllocatedAgentHostName = bjs.HostName
AND bjw.AllocatedAgentServiceName = bjs.ServiceName
WHERE bjs.AgentStatusTypeId = 2
AND bjs.IsFastLane = COALESCE(@UseFastLane, 0)
GROUP BY bjs.HostName,
bjs.ServiceName,
bjs.IsFastLane
ORDER BY IsFastLane DESC,
InProgress
I am using two SQL select statements here. Is it possible to use just one top level SQL select statement, nesting another one within?
You can replace the text AND bjs.IsFastLane = COALESCE(@UseFastLane, 0) with this:
AND bjs.IsFastLane = (SELECT Max(IsFastLane)
FROM BackgroundJobService)
which should give you an equivalent query, assuming that there are rows in the BackgroundJobService table.
If there might be zero rows in BackgroundJobService then you can wrap the select with a COALESCE function to return 0, like this:
COALESCE((SELECT Max(IsFastLane) FROM BackgroundJobService), 0)
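Putting it together, the single-statement version would then look roughly like this (a sketch of your query with the subquery substituted in; note that if IsFastLane is a BIT column, MAX needs a cast such as MAX(CAST(IsFastLane AS int))):
SELECT TOP 1 bjs.HostName AllocatedAgentHostName,
       bjs.ServiceName AllocatedAgentServiceName,
       bjs.IsFastLane,
       SUM(CASE
               WHEN bjw.WorkStatusTypeId IN ( 2, 3, 4, 10 ) THEN 1
               ELSE 0
           END) AS InProgress
FROM BackgroundJobService bjs
LEFT JOIN BackgroundJobWork bjw
       ON bjw.AllocatedAgentHostName = bjs.HostName
      AND bjw.AllocatedAgentServiceName = bjs.ServiceName
WHERE bjs.AgentStatusTypeId = 2
  AND bjs.IsFastLane = COALESCE((SELECT MAX(IsFastLane)
                                 FROM BackgroundJobService), 0)
GROUP BY bjs.HostName,
         bjs.ServiceName,
         bjs.IsFastLane
ORDER BY IsFastLane DESC,
         InProgress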