I have to create a report that measures the total credits received on an application. My problem is that I have 4 separate products whose WHERE-clause criteria differ, so I can't cover the whole dataset with a single query. I was originally going to write 4 separate queries and combine them with UNION.
I don't know whether I'm limiting myself because of my skill set, and I'm wondering whether UNION is the best approach here. Should I instead create a table, insert the data into it, and use that table for my report, rather than writing 4 separate queries and unioning them together?
Here's a snippet of one of the queries. The other three are similar but not identical. I'm not looking for help with this specific code so much as for the concept I should apply to complete the task.
SELECT distinct
w.application_id,
w.product_id,
X.scenario_id,
X.history_id,
x.is_applied,
product.product_name,
475 as total_fee_amount,
w.status,
w.funding_status,
w.amt_requested
FROM FEE INNER JOIN
(select * from APP
where ((status = 'W') or (APP.funding_status = 'F' and APP.status = 'A'
and APP.delete_app <> 1 and amt_requested between '25000' and '150000'))) as w
ON FEE.history_id = w.history_id INNER JOIN
product ON w.product_id = product.product_id INNER JOIN
(select scenario_id, history_id, is_applied from calc_history
where is_applied = '1' AND save_type = '0' ) as X
ON FEE.history_id = X.history_id AND FEE.scenario_id = X.scenario_id
WHERE
product.product_id in ('1064','1053','1065')
I'm using SQL Server 2008
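As a rough illustration of the UNION idea described above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table layout, the second product id (2001) and its fee (250) are hypothetical and are not taken from the real schema. Each branch carries its own WHERE rules and fee, and UNION ALL stacks the results into one dataset for the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for the APP table.
    CREATE TABLE app (application_id INTEGER, product_id INTEGER,
                      status TEXT, amt_requested INTEGER);
    INSERT INTO app VALUES
        (1, 1064, 'W', 30000),
        (2, 1064, 'A', 30000),
        (3, 2001, 'A', 90000),
        (4, 2001, 'A', 10000);
""")

# One SELECT per product group, each with its own criteria and fee.
# The column lists must line up across the branches.
report_sql = """
    SELECT application_id, product_id, 475 AS total_fee_amount
    FROM app
    WHERE product_id = 1064 AND status = 'W'
    UNION ALL
    SELECT application_id, product_id, 250 AS total_fee_amount
    FROM app
    WHERE product_id = 2001 AND amt_requested BETWEEN 25000 AND 150000
    ORDER BY application_id
"""
rows = conn.execute(report_sql).fetchall()
print(rows)
```

UNION ALL is used here on the assumption that the product groups never overlap; plain UNION would additionally remove duplicate rows, at the cost of extra work.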
I am running two related queries, and the count comes out wrong in one of them. When I add a 4th join and a condition on it, the count is incorrect; without the 4th join and the condition it is correct. The first version returns 2:
SELECT
pxixolog_details.*,
directions.direction,
COUNT(directions.direction) procent
FROM
pxixolog_details
LEFT JOIN psixologs_direction ON pxixolog_details.id = psixologs_direction.psixolog_id
LEFT JOIN directions ON directions.id = psixologs_direction.direction_id
LEFT JOIN psixologs_weeks ON pxixolog_details.id = psixologs_weeks.psixolog_id
WHERE
directions.direction IN(
'Трудности в отношениях',
'Проблемы со сном',
'Нежелательная агрессия'
)
AND birthday BETWEEN '1956-04-29' AND '2021-04-29' AND psixologs_weeks.week = '4'
GROUP BY
pxixolog_details.id
The second one doesn't work correctly; it returns 4:
SELECT
pxixolog_details.*,
directions.direction,
COUNT(directions.direction) procent
FROM
pxixolog_details
LEFT JOIN psixologs_direction ON pxixolog_details.id = psixologs_direction.psixolog_id
LEFT JOIN directions ON directions.id = psixologs_direction.direction_id
LEFT JOIN psixologs_weeks ON pxixolog_details.id = psixologs_weeks.psixolog_id
LEFT JOIN psixologs_times ON pxixolog_details.id = psixologs_times.psixolog_id
WHERE
directions.direction IN(
'Трудности в отношениях',
'Проблемы со сном',
'Нежелательная агрессия'
)
AND birthday BETWEEN '1956-04-29' AND '2021-04-29' AND psixologs_weeks.week = '4'
AND (psixologs_times.time = '09:00' OR psixologs_times.time = '10:00')
GROUP BY
pxixolog_details.id
What am I doing wrong?
You get double the count with 4 JOINs because the new (4th) JOIN matches 2 records (09:00 and 10:00) for each row produced by the first 3 JOINs, so every row is counted twice.
Check your data and make sure that your 4th JOIN condition yields a 1:1 record matching with the other data.
The last table, psixologs_times, has multiple matching rows for each psixolog_id.
You can easily see this using a query:
select psixolog_id, count(*)
from psixologs_times
group by psixolog_id
having count(*) > 1;
How you fix this problem depends on what you want to do. The simplest solution is to use count(distinct):
COUNT(DISTINCT directions.direction) as procent
However, this might just be hiding the problem. You might want to choose one row from the psixologs_times table. Or pre-aggregate it. Or do something else.
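The row multiplication described above can be reproduced on a tiny dataset. The sketch below uses Python's built-in sqlite3 module and simplified, hypothetical tables (details, directions, times) in place of the real ones. It compares the plain join with an EXISTS test, which is one possible way to filter on the times table without multiplying rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the tables in the question.
    CREATE TABLE details (id INTEGER PRIMARY KEY);
    CREATE TABLE directions (psixolog_id INTEGER, direction TEXT);
    CREATE TABLE times (psixolog_id INTEGER, time TEXT);
    INSERT INTO details VALUES (1);
    INSERT INTO directions VALUES (1, 'sleep'), (1, 'anger');
    INSERT INTO times VALUES (1, '09:00'), (1, '10:00');
""")

# Joining times directly repeats each direction once per matching time row.
fan_out = conn.execute("""
    SELECT COUNT(d.direction)
    FROM details p
    JOIN directions d ON d.psixolog_id = p.id
    JOIN times t ON t.psixolog_id = p.id
    WHERE t.time IN ('09:00', '10:00')
    GROUP BY p.id
""").fetchone()[0]

# EXISTS only tests whether a matching time exists, so no rows are multiplied.
with_exists = conn.execute("""
    SELECT COUNT(d.direction)
    FROM details p
    JOIN directions d ON d.psixolog_id = p.id
    WHERE EXISTS (SELECT 1 FROM times t
                  WHERE t.psixolog_id = p.id
                    AND t.time IN ('09:00', '10:00'))
    GROUP BY p.id
""").fetchone()[0]

print(fan_out, with_exists)
```

With two directions and two matching time slots, the plain join counts 4 while the EXISTS version counts 2, mirroring the numbers reported in the question.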
I am trying to write a query that gets Income Statements for Investments and calculates how much each has made dependent on the time frame.
SELECT
iis.GrossIncome, dates.[Month], dates.[Year], s.TotalGrossIncome
FROM
dbo.InvestmentIncomeStatements AS iis
JOIN
dbo.IssueDates AS dates ON iis.IssueDateID = dates.ID
JOIN
(SELECT
f.InvestmentID, SUM(f.GrossIncome) AS TotalGrossIncome
FROM
(SELECT
stment.InvestmentID, stment.GrossIncome, inment.[Name] as InvestmentName
FROM
dbo.InvestmentIncomeStatements stment
JOIN
dbo.IssueDates AS dts ON stment.IssueDateID = dts.ID
JOIN
dbo.Investments inment ON stment.InvestmentID = inment.ID
WHERE
dts.[Month] + dts.[Year] < '???') AS f
GROUP BY
f.InvestmentID, f.InvestmentName) AS s ON iis.InvestmentID = s.InvestmentID
In place of the '???' I would like to write 'dates.[Year] + dates.[Month]'.
However, I can't reference it from inside the subquery. What should I do?
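A derived table in the FROM clause generally cannot see columns of the outer query, whereas a correlated subquery in the SELECT list can. The following is only a sketch of that idea, run through Python's built-in sqlite3 module on a hypothetical single statements table with the year and month stored inline; it is not the original schema. It also assumes numeric year and month columns, combined as year * 100 + month so that periods compare in calendar order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened table: one row per income statement.
    CREATE TABLE statements (investment_id INTEGER, year INTEGER,
                             month INTEGER, gross_income REAL);
    INSERT INTO statements VALUES
        (1, 2020, 11, 100.0),
        (1, 2020, 12, 50.0),
        (1, 2021, 1, 25.0),
        (2, 2020, 12, 10.0);
""")

# The inner query refers to s.year and s.month from the outer row, so the
# total is recomputed for every statement's own cut-off period.
rows = conn.execute("""
    SELECT s.investment_id, s.year, s.month, s.gross_income,
           (SELECT SUM(p.gross_income)
            FROM statements p
            WHERE p.investment_id = s.investment_id
              AND p.year * 100 + p.month < s.year * 100 + s.month
           ) AS total_before
    FROM statements s
    ORDER BY s.investment_id, s.year, s.month
""").fetchall()

for row in rows:
    print(row)
```

The total is NULL (None in Python) for the earliest statement of each investment because no earlier rows exist; wrapping the subquery in COALESCE(..., 0) would turn that into zero if preferred.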
I have a problem I can't really figure out, even though I thought I had the solution.
I think this is DB2 SQL by the way.
I have a customer number and a country code (extracted from a string using SUBSTR), and I want to exclude rows where that combination appears in a subquery, like so:
SELECT ku.orgnr AS customer ,
Substr(bu.bank_account_swiftadr,5,2) AS country
FROM db811.bet_utl bu
LEFT JOIN db811.henv_utl bh
ON bu.betaling_urn = bh.betaling_urn
LEFT JOIN db811.betaling_status bs
ON bu.betaling_status = bs.betaling_status
LEFT JOIN db811.kunde_orgnr ku
ON bu.kundenr = ku.kundenr
WHERE bu.kanal = 'N'
AND (ku.orgnr, Substr(bu.bank_account_swiftadr,5,2)) ;
The combination should NOT be IN the results of the query below:
SELECT ku.orgnr AS customer ,
Substr(bu.bank_account_swiftadr,5,2) AS country ,
COUNT(*) AS numberof
FROM db811.bet_utl_hist bu
LEFT JOIN db811.kunde_orgnr ku
ON bu.kundenr = ku.kundenr
WHERE bu.kanal = 'N'
AND bu.betalingsdato > '2016-01-01'
GROUP BY ku.orgnr ,
substr(bu.bank_account_swiftadr,5,2);
I thought this should work, but it seems to match on just one of the two columns, and I need both to match in order to exclude the row with NOT IN.
I assume I am missing something basic, since I am quite new at this.
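One common way to exclude on a combination of two columns is NOT EXISTS with both columns compared inside the subquery. The sketch below runs on hypothetical payments and history tables through Python's built-in sqlite3 module, not on DB2 and not on the real schema, so treat it as an illustration of the pattern only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: current payments and historical payments.
    CREATE TABLE payments (customer TEXT, country TEXT);
    CREATE TABLE history (customer TEXT, country TEXT);
    INSERT INTO payments VALUES ('A', 'NO'), ('A', 'SE'), ('B', 'NO');
    INSERT INTO history VALUES ('A', 'NO'), ('B', 'DK');
""")

# A payment is kept only when no history row matches BOTH columns.
rows = conn.execute("""
    SELECT p.customer, p.country
    FROM payments p
    WHERE NOT EXISTS (SELECT 1
                      FROM history h
                      WHERE h.customer = p.customer
                        AND h.country = p.country)
    ORDER BY p.customer, p.country
""").fetchall()
print(rows)
```

Customer A with country NO is excluded because that exact pair exists in history, while A/SE and B/NO remain even though A and NO each appear in history separately. NOT EXISTS also avoids the surprise that NOT IN returns no rows when the subquery yields a NULL, which can happen when the customer number comes from a LEFT JOIN.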
The code below is supposed to return unique lp_num values from the subquery for use in the outer query, but I am still getting multiple rows per lp_num. A ReferenceNumber can have multiple ApptDate records, but each lp_num can only have 1 ref_num. That's why I tried to retrieve unique lp_num records all the way down in the subquery, but it doesn't work. I am using Report Builder 3.0.
The desired output would have only unique values in the lp_num field, because each lp_num value is one single pallet. The information to the right shows when it arrived (ApptDate) and what the reference number of the delivery is (ref_num). It therefore makes no sense for a pallet to have multiple receipt dates: it can only arrive once.
SELECT DISTINCT
dbo.ISW_LPTrans.item,
dbo.ISW_LPTrans.lot,
dbo.ISW_LPTrans.trans_type,
dbo.ISW_LPTrans.lp_num,
dbo.ISW_LPTrans.ref_num,
(MIN(CONVERT(VARCHAR(10),dbo.CW_CheckInOut.ApptDate,101))) as appt_date_only,
dbo.CW_CheckInOut.ApptTime,
dbo.item.description,
dbo.item.u_m,
dbo.ISW_LPTrans.qty,
(CASE
WHEN dbo.ISW_LPTrans.trans_type = 'F'
THEN 'Produced internally'
ELSE
(CASE
WHEN dbo.ISW_LPTrans.trans_type = 'R'
THEN 'Received from outside'
END)
END
) as original_source
FROM
dbo.ISW_LPTrans
INNER JOIN dbo.CW_Dock_Schedule ON LTRIM(RTRIM(dbo.ISW_LPTrans.ref_num)) = dbo.CW_Dock_Schedule.ReferenceNumber
INNER JOIN dbo.CW_CheckInOut ON dbo.CW_CheckInOut.TruckID = dbo.CW_Dock_Schedule.TruckID
INNER JOIN dbo.item ON dbo.item.item = dbo.ISW_LPTrans.item
WHERE
(dbo.ISW_LPTrans.trans_type = 'R') AND
--CONVERT(VARCHAR(10),dbo.CW_CheckInOut.ApptDate,101) <= CONVERT(VARCHAR(10),dbo.ISW_LPTrans.trans_date,101) AND
dbo.ISW_LPTrans.lp_num IN
(SELECT DISTINCT
dbo.ISW_LPTrans.lp_num
FROM
dbo.ISW_LPTrans
INNER JOIN dbo.item ON dbo.ISW_LPTrans.item = dbo.item.item
INNER JOIN dbo.job ON dbo.ISW_LPTrans.ref_num = dbo.job.job AND dbo.ISW_LPTrans.ref_line_suf = dbo.job.suffix
WHERE
(dbo.ISW_LPTrans.trans_type = 'W' OR dbo.ISW_LPTrans.trans_type = 'I') AND
dbo.ISW_LPTrans.ref_num IN
(SELECT
dbo.ISW_LPTrans.ref_num
FROM
dbo.ISW_LPTrans
--INNER JOIN dbo.ISW_LPTrans on dbo.ISW_LPTrans.
WHERE
dbo.ISW_LPTrans.item LIKE #item AND
dbo.ISW_LPTrans.lot LIKE #lot AND
dbo.ISW_LPTrans.trans_type = 'F'
GROUP BY
dbo.ISW_LPTrans.ref_num
) AND
dbo.ISW_LPTrans.ref_line_suf IN
(SELECT
dbo.ISW_LPTrans.ref_line_suf
FROM
dbo.ISW_LPTrans
--INNER JOIN dbo.ISW_LPTrans on dbo.ISW_LPTrans.
WHERE
dbo.ISW_LPTrans.item LIKE #item AND
dbo.ISW_LPTrans.lot LIKE #lot AND
dbo.ISW_LPTrans.trans_type = 'F'
GROUP BY
dbo.ISW_LPTrans.ref_line_suf
)
GROUP BY
dbo.ISW_LPTrans.lp_num
HAVING
SUM(dbo.ISW_LPTrans.qty) < 0
)
GROUP BY
dbo.ISW_LPTrans.item,
dbo.ISW_LPTrans.lot,
dbo.ISW_LPTrans.trans_type,
dbo.ISW_LPTrans.lp_num,
dbo.ISW_LPTrans.ref_num,
dbo.CW_CheckInOut.ApptDate,
dbo.CW_CheckInOut.ApptTime,
dbo.item.description,
dbo.item.u_m,
dbo.ISW_LPTrans.qty
ORDER BY
dbo.ISW_LPTrans.lp_num
In a nutshell: the way you use DISTINCT is logically wrong from an SQL perspective.
Your DISTINCT is in an IN subquery in the WHERE clause, and at that point in the code it has absolutely no effect (apart from a possible performance penalty). Think about it: if the outer query returns non-unique values of dbo.ISW_LPTrans.lp_num (which obviously happens), those values can still be among the distinct values of the IN subquery. IN does not enforce a 1-to-1 match; it only requires that each outer value appears among the inner values, and the same inner value can match many outer rows. So it is definitely not DISTINCT's fault.
I would go through the following check steps:
See if there is insufficient JOIN ON condition(s) in the outer FROM section that leads to data multiplication (e.g. if a table has primary-to-foreign key relation on several columns, but you join on one of them only etc.).
Check which of the sources contains non-distinct records in the outer FROM section - then either cleanse your source, or adjust the JOIN condition and / or the WHERE clause so that you only pick distinct & correct records. In fact you might need to SELECT DISTINCT in the FROM sections - there it would make much more sense.
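To make the second check concrete, here is a small sketch using Python's built-in sqlite3 module with hypothetical pallets and appointments tables standing in for the real ones. It shows a join that multiplies pallets because one reference number has several appointment rows, and a variant that reduces the appointments to one row per reference number before joining. Picking the earliest date is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: one row per pallet, several appointments per ref.
    CREATE TABLE pallets (lp_num TEXT, ref_num TEXT);
    CREATE TABLE appointments (ref_num TEXT, appt_date TEXT);
    INSERT INTO pallets VALUES ('LP1', 'R100'), ('LP2', 'R200');
    INSERT INTO appointments VALUES
        ('R100', '2020-01-05'), ('R100', '2020-01-09'), ('R200', '2020-02-01');
""")

# Plain join: LP1 appears twice because R100 has two appointment rows.
duplicated = conn.execute("""
    SELECT p.lp_num, a.appt_date
    FROM pallets p
    JOIN appointments a ON a.ref_num = p.ref_num
    ORDER BY p.lp_num, a.appt_date
""").fetchall()

# Pre-aggregated join: the derived table holds one row per ref_num.
one_per_pallet = conn.execute("""
    SELECT p.lp_num, a.first_appt
    FROM pallets p
    JOIN (SELECT ref_num, MIN(appt_date) AS first_appt
          FROM appointments
          GROUP BY ref_num) a ON a.ref_num = p.ref_num
    ORDER BY p.lp_num
""").fetchall()

print(duplicated)
print(one_per_pallet)
```

Aggregating inside the derived table keeps the outer query free of GROUP BY on columns such as the appointment date and time, which is what lets several rows per pallet through in a grouped outer query.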
What is generally considered the most efficient way to do this type of query?
We have a database of 10 years' worth of laboratory data, and we would like to select performance data for various tests. This query, for example, selects the number of hours it has taken to do a test, calculates an average turnaround time (TAT), and allows us to plot a sparkline of average TAT per day.
Say we have 100 test names. Is it acceptable in terms of performance to iterate over the test names in a loop and fire this query off once per iteration? Or is there a more efficient way?
SELECT
Date_Authorised_Index.Date_Authorised
, Result_Set.Date_Booked_In
, avg(DATEDIFF('hh',Result_Set.Date_Time_Booked_In,Result_Set.Date_Time_Authorised)) as HrsIn
, count(Date_Authorised_Index.Date_Authorised) as numbers
, Date_Authorised_Index.Registration_Number
, Date_Authorised_Index.Request_Row_ID
, Date_Authorised_Index.Specimen_Number
, Result_Set.Authorised_By
, Result_Set.Namespace
, Result_Set.Set_Code
, Result_Set.Date_Time_Authorised
, Request.Date_Time_Received
, Request.Location
FROM
Date_Authorised_Index Date_Authorised_Index
, Result_Set Result_Set
, Request
WHERE
Date_Authorised_Index.Date_Authorised = Result_Set.Date_Authorised
AND Date_Authorised_Index.Request_Row_ID = Request.Request_Row_ID
AND Date_Authorised_Index.Request_Row_ID = Result_Set.Request_Row_ID
AND (Date_Authorised_Index.Discipline='C') AND Result_Set.Set_Code=?
GROUP BY
Result_Set.Date_Booked_In
For starters, I would rewrite this query so that it uses explicit join syntax.
Also, even though MySQL does not force you to list every non-aggregated column in the GROUP BY clause, that doesn't make omitting them a good idea.
Unless Result_Set.Date_Booked_In uniquely identifies a row, the non-aggregated columns take their values from an arbitrary row within each group.
SELECT
dai.Date_Authorised
, rs.Date_Booked_In
, avg(DATEDIFF('hh',rs.Date_Time_Booked_In,rs.Date_Time_Authorised)) as HrsIn
, count(dai.Date_Authorised) as numbers
, dai.Registration_Number
, dai.Request_Row_ID
, dai.Specimen_Number
, rs.Authorised_By
, rs.Namespace
, rs.Set_Code
, rs.Date_Time_Authorised
, r.Date_Time_Received
, r.Location
FROM
Date_Authorised_Index dai
INNER JOIN Result_Set rs ON (dai.Date_Authorised = rs.Date_Authorised
AND dai.Request_Row_ID = rs.Request_Row_ID)
INNER JOIN Request r ON (dai.Request_Row_ID = r.Request_Row_ID)
WHERE
(dai.Discipline= 'C') AND rs.Set_Code=?
GROUP BY
rs.Date_Booked_In
If you want to select all 100 tests in one go, just make a new table (Setcodes below) containing the set codes you want to select, and join against that.
Make sure you index the column sc.Set_Code (or, better yet, make it the primary key).
SELECT lots_of_columns
FROM table1 as dai
INNER JOIN table2 as rs ON (what you joined on before)
INNER JOIN table3 as r ON (same here)
INNER JOIN Setcodes as sc ON (sc.Set_Code = rs.Set_Code) -- extra join
WHERE
dai.discipline = 'C'
GROUP BY rs.Date_Booked_In
Or you can even use IN (...) as below, although that will probably be slower than a join.
SELECT lots_of_columns
FROM table1 as dai
INNER JOIN table2 as rs ON (what you joined on before)
INNER JOIN table3 as r ON (same here)
WHERE
dai.discipline = 'C' AND rs.Set_Code IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
GROUP BY rs.Date_Booked_In
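The set-code table idea can be tried out on a toy dataset. The sketch below uses Python's built-in sqlite3 module with a hypothetical, flattened result_set table that already holds the turnaround hours, plus a setcodes table listing the tests of interest. Note one assumption that goes beyond the queries above: the example also groups by set_code, since otherwise the averages of different tests booked in on the same day would be merged into one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, flattened data: hours is the precomputed turnaround time.
    CREATE TABLE result_set (set_code TEXT, date_booked_in TEXT, hours INTEGER);
    CREATE TABLE setcodes (set_code TEXT PRIMARY KEY);
    INSERT INTO result_set VALUES
        ('FBC', '2020-01-01', 2), ('FBC', '2020-01-01', 4),
        ('UE',  '2020-01-01', 6), ('LFT', '2020-01-01', 8);
    INSERT INTO setcodes VALUES ('FBC'), ('UE');
""")

# One query returns the per-day average for every test listed in setcodes,
# instead of one round trip per test name.
rows = conn.execute("""
    SELECT rs.set_code, rs.date_booked_in,
           AVG(rs.hours) AS avg_hours, COUNT(*) AS numbers
    FROM result_set rs
    JOIN setcodes sc ON sc.set_code = rs.set_code
    GROUP BY rs.set_code, rs.date_booked_in
    ORDER BY rs.set_code, rs.date_booked_in
""").fetchall()
print(rows)
```

The LFT row is left out because its code is not in setcodes, and the application can split the single result set by set_code to draw one sparkline per test.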