How do I make a query shorter and neater?


I'm trying to make this query more understandable and neater, but I'm not sure how.
SELECT a.Patient_id, COUNT (p.Person_id) AS "Number of Operations", SUM (w.Daily_charge * (a.Discharge_date - a.Admission_date) + ot.Theatre_fee + b.Charges + c.Charges ) AS "Total Payment"
FROM person p, admission a, ward w, operation o, operation_type ot, staff b, staff c
WHERE w.Ward_code = a.Ward_code AND p.Person_id = a.Patient_id
AND a.Admission_id = o.Admission_id AND ot.Op_code = o.Actual_op
AND o.Surgeon = b.Person_id AND o.Anaesthetist = c.Person_id
GROUP BY a.Patient_id, p.Person_id
ORDER BY COUNT (p.Person_id) DESC FETCH FIRST 1 ROWS ONLY;

Any decent formatter would do it for you.
Other than that, use JOINs instead of comma-separated tables in the FROM clause.
Also remove p.person_id from the GROUP BY clause. It serves no purpose there: it is equal to a.patient_id (which is correctly in the clause), and it is not part of the SELECT statement's column list.
So:
select a.patient_id,
count (p.person_id) as "number of operations",
sum (w.daily_charge * (a.discharge_date - a.admission_date) +
ot.theatre_fee + b.charges + c.charges
) as "total payment"
from person p join admission a on a.patient_id = p.person_id
join ward w on w.ward_code = a.ward_code
join operation o on o.admission_id = a.admission_id
join operation_type ot on ot.op_code = o.actual_op
join staff b on o.surgeon = b.person_id
join staff c on o.anaesthetist = c.person_id
group by a.patient_id
order by count (p.person_id) desc
fetch first 1 rows only;

Sadly there is no perfect formatter, although as Ed mentioned in the comments, they can be a start if you review the settings carefully. (It's a tradition in the industry that the default formatter settings are always horrible.)
It's also been said (I think by Steven Feuerstein) that you should only set formatting rules that are supported by your formatter, and of course he makes a good point. But taken with the limitations of all formatters, an industry tradition for horrible formatting and the impossibility of consistent rules for formatting SQL anyway, that puts us PL/SQL developers in a difficult position.
I'd say the first principle of computer code layout is to use vertically aligned blocks to indicate dependency levels (similar to the grids used in graphic design). A lot of the choices then become about how to apply that principle.
We then need to separate the code into logical sections, but at the same time not let it sprawl down the page unnecessarily. I think this is difficult for automated formatters as the rules become a bit fuzzy, e.g. for a join with only one condition I keep it on one line, but if there is more than one I start splitting it out onto multiple lines, one per ON or AND keyword. The same goes for your complex sum() expression: normally I would place it all on one line, but if it aids readability then I split it up.
Finally, opinions vary on where to place commas in stacked lists, of which SQL has a lot. I say they go on the left, to act like bullet points and also make it easier to add items to the ends of lists. Others will disagree.
select ad.patient_id
, count(*) as "Number of Operations"
, sum(
wa.daily_charge * (ad.discharge_date - ad.admission_date)
+ ot.theatre_fee + ss.charges + sa.charges
) as "Total Payment"
from person pr
join admission ad on ad.patient_id = pr.person_id
join ward wa on wa.ward_code = ad.ward_code
join operation op on op.admission_id = ad.admission_id
join operation_type ot on ot.op_code = op.actual_op
join staff ss on ss.person_id = op.surgeon
join staff sa on sa.person_id = op.anaesthetist
group by ad.patient_id
order by count(*) desc
fetch first row only;

Related

How to combine taking several joins and add constraints on query?

How can I answer the following question by querying this database?
The police are looking for a brown-haired woman who checked in at the gym sometime between September 8th 2016 and October 24th 2016. This woman has a silver membership. Can we find the name of this woman?
I tried the following query:
dbGetQuery(db,"
SELECT *
FROM get_fit_now_member
JOIN get_fit_now_check_in ON id = membership_id
WHERE check_in_date BETWEEN '20160909' AND '20161023' AND membership_status = 'silver'
")
This gives me the following output:
The problem is that I have to join multiple times and at the same time add different constraints. How can I solve this question in a clever way?

Here's how I would write the query:
SELECT m.name
FROM get_fit_now_member AS m
JOIN get_fit_now_check_in AS c ON m.id = c.membership_id
JOIN person AS p ON m.person_id = p.id
JOIN drivers_license AS d ON p.license_id = d.id
WHERE c.check_in_date BETWEEN '20160908' AND '20161024'
AND m.membership_status = 'silver'
AND d.hair_color = 'brown';
JOIN is just an operator, like + is in arithmetic. In arithmetic, you can extend the expressions with more terms, like a + b + c + d. In SQL, you can use JOIN multiple times in a similar way.
I used correlation names (m, c, p, d) to make it more convenient to qualify the table names, so I can be clear for example which id I mean in each join condition, since a column named id exists in multiple tables.
I also changed the date expression, because I assume "between" is meant to include the two dates named in the problem statement.
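To illustrate the point about inclusive bounds, here is a minimal sketch using Python's built-in sqlite3 module. The check_in table and its rows are made up for illustration, and it assumes dates are stored as fixed-width YYYYMMDD text, which may not match the real database.

```python
import sqlite3

# Hypothetical table and data, only to show how BETWEEN treats its bounds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check_in (membership_id TEXT, check_in_date TEXT)")
conn.executemany(
    "INSERT INTO check_in VALUES (?, ?)",
    [
        ("A", "20160907"),
        ("B", "20160908"),
        ("C", "20161001"),
        ("D", "20161024"),
        ("E", "20161025"),
    ],
)


def members_between(lo, hi):
    """Return membership ids whose check-in date lies in [lo, hi]."""
    rows = conn.execute(
        "SELECT membership_id FROM check_in "
        "WHERE check_in_date BETWEEN ? AND ? "
        "ORDER BY membership_id",
        (lo, hi),
    ).fetchall()
    return [r[0] for r in rows]


# BETWEEN includes both endpoints, so the named dates themselves match.
inclusive = members_between("20160908", "20161024")
# Shrinking the range by a day on each side silently drops B and D.
narrower = members_between("20160909", "20161023")
print(inclusive)
print(narrower)
```

With the narrower range from the question, check-ins on the two boundary dates would be missed.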

How would you explain this query in layman terms?

Here is the database I'm using: https://drive.google.com/file/d/1ArJekOQpal0JFIr1h3NXYcFVngnCNUxg/view?usp=sharing
select distinct
AC1.givename, AC1.famname, AC2.givename, AC2.famname
from
academic AC1, author AU1, academic AC2, author AU2
where
AC1.acnum = AU1.acnum
and AC2.acnum = AU2.acnum
and AU1.panum = AU2.panum
and AU2.acnum > AU1.acnum
and not exists (select *
from Interest I1, Interest I2
where I1.acnum = AC1.acnum
and I2.acnum = AC2.acnum);
Output:
I'm having trouble explaining the output of the subquery and of the whole query in layman's terms (plain English).
Not sure if my explanation is right:
"The subquery finds the interested fields where two authors have no common field of interest.
The whole query finds the first and last names of the authors of papers which have at least two authors, and have no common field of interest."

As it currently stands, the subquery will produce rows if each academic has at least one interest.
So overall, the query is "produce pairs of academics who co-authored at least one paper and where at least one of them has no interests whatsoever". It's difficult to believe that that was the intent, and if it was, there are clearer ways of writing it that show that this is what we're looking for.
If that's the query we want, though, I'd write it as:
SELECT
AC1.givename, AC1.famname, AC2.givename, AC2.famname
FROM
academic AC1
inner join
academic AC2
on
AC1.acnum < AC2.acnum
WHERE EXISTS
(select * from author au1 inner join author au2 on au1.panum = au2.panum
where au1.acnum = ac1.acnum and au2.acnum = ac2.acnum)
AND
(
NOT EXISTS (select * from interest i where i.acnum = ac1.acnum)
OR
NOT EXISTS (select * from interest i where i.acnum = ac2.acnum)
)
If, as is more likely, we wanted pairs of co-authors who have no interests in common, we would write something like:
SELECT
AC1.givename, AC1.famname, AC2.givename, AC2.famname
FROM
academic AC1
inner join
academic AC2
on
AC1.acnum < AC2.acnum
WHERE EXISTS
(select * from author au1 inner join author au2 on au1.panum = au2.panum
where au1.acnum = ac1.acnum and au2.acnum = ac2.acnum)
AND NOT EXISTS
(select * from interest i1 inner join interest i2 on i1.field = i2.field
where i1.acnum = ac1.acnum and i2.acnum = ac2.acnum)
Notice how neither of my queries uses distinct, because we've made sure that the outer query isn't joining additional rows where we only care about the existence or absence of those rows - we've moved all such checks into EXISTS subqueries.
I generally see distinct used far too often when the author is getting multiple results when they only want a single result and they're unwilling to expend the effort to discover why they're getting multiple results. In this case, it would be situations where the same pairs of academics have co-authored more than one paper.
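The duplicate-row effect described above can be reproduced with a small sketch using Python's built-in sqlite3 module. The tables are cut down to the columns that matter and the rows are invented: academics 1 and 2 share two papers, academics 2 and 3 share one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE academic (acnum INTEGER);
    CREATE TABLE author (panum INTEGER, acnum INTEGER);
    INSERT INTO academic VALUES (1), (2), (3);
    -- papers 10 and 11 are both by academics 1 and 2; paper 12 is by 2 and 3
    INSERT INTO author VALUES (10, 1), (10, 2), (11, 1), (11, 2), (12, 2), (12, 3);
""")

# Joining through author yields one row per shared paper, hence duplicates.
join_pairs = conn.execute("""
    SELECT a1.acnum, a2.acnum
    FROM academic a1
    JOIN author au1 ON au1.acnum = a1.acnum
    JOIN author au2 ON au2.panum = au1.panum
    JOIN academic a2 ON a2.acnum = au2.acnum
    WHERE a1.acnum < a2.acnum
    ORDER BY a1.acnum, a2.acnum
""").fetchall()

# EXISTS only asks whether a shared paper exists, so each pair appears once.
exists_pairs = conn.execute("""
    SELECT a1.acnum, a2.acnum
    FROM academic a1
    JOIN academic a2 ON a1.acnum < a2.acnum
    WHERE EXISTS (
        SELECT 1
        FROM author au1
        JOIN author au2 ON au1.panum = au2.panum
        WHERE au1.acnum = a1.acnum AND au2.acnum = a2.acnum
    )
    ORDER BY a1.acnum, a2.acnum
""").fetchall()

print(join_pairs)
print(exists_pairs)
```

The join version returns the pair (1, 2) twice, which is the situation where people tend to reach for distinct; the EXISTS version never produces the duplicate in the first place.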

Two almost identical queries returning different results

I am getting different results for the following two queries and I have no idea why. The only difference is one has an IN and one has an equals.
Before I go into the queries you should know that I found a better way to do it by moving the subquery into a common table expression, but this is still driving me crazy! I really want to know what caused the issue in the first place, I am asking out of curiosity
Here's the first query:
use [DB.90_39733]
Select distinct x.uniqproducer, cn.Firstname,cn.lastname,e.code,
ecn.FirstName, ecn.LastName, ecn.entid, x.uniqline
from product x
join employ e on e.EmpID=x.uniqproducer
join contactname cn on cn.uniqentity=e.uniqentity
join [ETL_GAWR92]..idlookupentity ide on ide.enttype='EM'
and ide.UniqEntity=e.UniqEntity
left join [ETL_GAWR92]..EntConName ecn on ecn.entid=ide.empid
and ecn.opt='Y'
Where x.UniqProducer =(SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM')
And the second one:
use [DB.90_39733]
Select distinct x.uniqproducer, cn.Firstname,cn.lastname,e.code,
ecn.FirstName, ecn.LastName, ecn.entid, x.uniqline
from product x
join employ e on e.EmpID=x.uniqproducer
join contactname cn on cn.uniqentity=e.uniqentity
join [ETL_GAWR92]..idlookupentity ide on ide.enttype='EM'
and ide.UniqEntity=e.UniqEntity
left join [ETL_GAWR92]..EntConName ecn on ecn.entid=ide.empid
and ecn.opt='Y'
Where x.UniqProducer IN (SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM')
The first query returns 0 rows while the second query returns 2 rows. The only difference is x.UniqProducer = versus x.UniqProducer IN in the last where clause.
Thanks for your time

SELECT TOP 1 doesn't guarantee that the same record will be returned each time.
Add an ORDER BY to your select to make sure the same record is returned.
(SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM' ORDER BY idl.UniqEntity)
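As a rough illustration of why the ORDER BY matters, here is a sketch using Python's built-in sqlite3 module. SQLite has no TOP, so LIMIT 1 stands in for TOP 1, and the id_lookup table is a made-up stand-in for IDLookupEntity. The same values are loaded in two different physical orders; with an ORDER BY the single row returned is the same both times.

```python
import sqlite3


def first_entity(ids):
    """Load ids in the given order and return the first one by ORDER BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE id_lookup (uniq_entity INTEGER)")
    conn.executemany("INSERT INTO id_lookup VALUES (?)", [(i,) for i in ids])
    # ORDER BY pins down which row counts as "first"; without it the engine
    # is free to return whichever row it happens to reach first.
    row = conn.execute(
        "SELECT uniq_entity FROM id_lookup ORDER BY uniq_entity LIMIT 1"
    ).fetchone()
    conn.close()
    return row[0]


a = first_entity([30, 10, 20])
b = first_entity([20, 30, 10])
print(a, b)
```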

I would guess (with strong emphasis on the word “guess”) that the reason is based on how equals and in are processed by the query engine. For equals, SQL knows it needs to do a comparison with a specific value, whereas for in, SQL knows it needs to build a subset and find out whether the "outer" value is in that "inner" subset. Yes, the end results should be the same as there’s only 1 row returned by the subquery, but as @RickS pointed out, without any ordering there’s no guarantee of which value ends up “on top”, and the (sub)query plan used to build the in-driven subquery might differ from that used by the equals pull.
A follow-up question: which is the correct dataset? When you analyze the actual data, should you have gotten zero, two, or a different number of rows?

SSRS 2008 R2 / SQL - How to filter groups but keep detail data?

EDIT - I'm reposting this question in an attempt to explain better what I mean.
I'm using SQL 2008 R2. I work for a retail department store, and we need a report showing all the sales orders made in each department and in the sections of those departments.
What I want is to group all the sales order lines by department and section, but remove only the sections that have a total sales value of less than £50. I still want to see order lines that are over £50, though.
Here is an example of what I currently have:
Data before filtering
I want to remove the Accessories section and all lines contained within it, as it has a total section value of less than £50. So I would want it to look like this after filtering:
Data after filtering
Here is my code:
SELECT department.department_name
,section.section_name
,sales_order_detail.sales_order_number
,sales_order_detail.sales_order_line
,LineValue
FROM
sales_order_detail INNER JOIN stock_item ON sales_order_detail.stock_item_code = stock_item.stock_item_code
INNER JOIN style ON stock_item.style_code = style.style_code
INNER JOIN department ON style.dept_code = department.department_code
INNER JOIN section ON style.section_code = section.section_code AND style.dept_code = section.department_code AND department.department_code = section.department_code
Can you please explain all the ways this can be done? I've tried using GROUP BY and HAVING, but that filters out all my sales order lines. I've also tried using a Group Filter in the Visual Studio report design surface, which removes the lines, but then aggregates calculated at the Department group scope don't take into account the lines removed at the section level.
I appreciate any help I can get on this.
Jacob

As you are using 2008 R2, you can use window functions to calculate the total of the group that each row belongs to (the partition by part of the over clause below) and then wrap your query in a filtering select statement. Without your data this is obviously untested, but it should work:
select department_name
,section_name
,sales_order_number
,sales_order_line
,LineValue
,GroupTotal
from(
select d.department_name
,se.section_name
,sod.sales_order_number
,sod.sales_order_line
,sod.qty_ordered * sod.selling_price AS LineValue
,sum(sod.qty_ordered * sod.selling_price) over (partition by d.department_name
,se.section_name
) as GroupTotal
from sales_order_detail sod
inner join stock_item si
on sod.stock_item_code = si.stock_item_code
inner join style s
on si.style_code = s.style_code
inner join department d
on s.dept_code = d.department_code
inner join section se
on s.section_code = se.section_code
and s.dept_code = se.department_code
and d.department_code = se.department_code
) a
where GroupTotal > 50
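The pattern above can be tried out on toy data with Python's built-in sqlite3 module (window functions need SQLite 3.25 or later, which recent Python builds bundle). The sales_line table, its columns and its rows are invented for this sketch, and it uses >= 50 on the assumption that only sections totalling less than £50 should disappear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_line ("
    "department TEXT, section TEXT, order_no INTEGER, line_value REAL)"
)
conn.executemany(
    "INSERT INTO sales_line VALUES (?, ?, ?, ?)",
    [
        ("Menswear", "Shirts", 1, 40.0),
        ("Menswear", "Shirts", 2, 30.0),
        ("Menswear", "Accessories", 3, 20.0),
        ("Menswear", "Accessories", 4, 10.0),
        ("Womenswear", "Dresses", 5, 60.0),
    ],
)

# The inner query keeps every detail line and attaches its section total;
# the outer query then drops whole sections whose total is under 50.
kept = conn.execute("""
    SELECT department, section, order_no, line_value, group_total
    FROM (
        SELECT department, section, order_no, line_value,
               SUM(line_value) OVER (PARTITION BY department, section) AS group_total
        FROM sales_line
    ) AS t
    WHERE group_total >= 50
    ORDER BY order_no
""").fetchall()

for row in kept:
    print(row)
```

Both Shirts lines survive even though each is under 50 on its own, because the filter looks at the section total (70) rather than the line value, while both Accessories lines (total 30) are removed.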

Complicated Calculation Using Oracle SQL

I have created a database for an imaginary solicitors' firm, and my last query to complete is driving me insane. I need to work out the total a solicitor has made in their career with the company. I have time_spent and rate to multiply, and a special rate to add (the special rate is a one-off charge for corporate contracts, so not many cases have one). The best I could come up with is the code below. It does what I want, but it only displays the solicitors working on a case with a special rate applied to it.
I essentially want it to display the result of the query in a table even if the special rate is NULL.
I have ordered the table to show the highest amount first so I can use ROWNUM to show only the top 10% of earners.
CREATE VIEW rich_solicitors AS
SELECT notes.time_spent * rate.rate_amnt + special_rate.s_rate_amnt AS solicitor_made,
notes.case_id
FROM notes,
rate,
solicitor_rate,
solicitor,
case,
contract,
special_rate
WHERE notes.solicitor_id = solicitor.solicitor_id
AND solicitor.solicitor_id = solicitor_rate.solicitor_id
AND solicitor_rate.rate_id = rate.rate_id
AND notes.case_id = case.case_id
AND case.contract_id = contract.contract_id
AND contract.contract_id = special_rate.contract_id
ORDER BY -solicitor_made;
Query:
SELECT *
FROM rich_solicitors
WHERE ROWNUM <= (SELECT COUNT(*)/10
FROM rich_solicitors)

I'm suspicious of your use of ROWNUM in your example query...
Oracle9i+ supports analytic functions, like ROW_NUMBER and NTILE, to make queries like your example easier. Analytics are also ANSI, so the syntax is consistent wherever they are implemented (when this was written that excluded MySQL and SQLite; MySQL 8.0 and SQLite 3.25 later added them). I rewrote your query as:
SELECT x.*
FROM (SELECT n.time_spent * r.rate_amnt + COALESCE(spr.s_rate_amnt, 0) AS solicitor_made,
n.case_id,
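-- the alias solicitor_made is not visible at this level, so the expression is repeated; DESC puts the top earners in bucket 1
NTILE(10) OVER (ORDER BY n.time_spent * r.rate_amnt + COALESCE(spr.s_rate_amnt, 0) DESC) AS rank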
FROM NOTES n
JOIN SOLICITOR s ON s.solicitor_id = n.solicitor_id
JOIN SOLICITOR_RATE sr ON sr.solicitor_id = s.solicitor_id
JOIN RATE r ON r.rate_id = sr.rate_id
JOIN CASE c ON c.case_id = n.case_id
JOIN CONTRACT cntrct ON cntrct.contract_id = c.contract_id
LEFT JOIN SPECIAL_RATE spr ON spr.contract_id = cntrct.contract_id) x
WHERE x.rank = 1
If you're new to SQL, I recommend using ANSI-92 syntax. Your example uses ANSI-89, which doesn't support OUTER JOINs and is considered deprecated. I used a LEFT OUTER JOIN against the SPECIAL_RATE table because not all jobs are likely to have a special rate attached to them.
It's also not recommended to include an ORDER BY in views, because views encapsulate the query -- no one will know what the default ordering is, and will likely include their own (waste of resources potentially).
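For a feel of how NTILE(10) splits rows into ten buckets, here is a small sketch using Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Oracle. The earnings table and its 20 rows are invented. Note that the sort direction inside OVER decides which end of the ranking lands in bucket 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE earnings (solicitor_id INTEGER, amount INTEGER)")
# 20 hypothetical solicitors earning 100, 200, ..., 2000.
conn.executemany(
    "INSERT INTO earnings VALUES (?, ?)",
    [(i, i * 100) for i in range(1, 21)],
)


def bucket_one(direction):
    """Return the amounts that NTILE(10) places in bucket 1."""
    sql = f"""
        SELECT amount
        FROM (
            SELECT amount,
                   NTILE(10) OVER (ORDER BY amount {direction}) AS decile
            FROM earnings
        ) AS t
        WHERE decile = 1
        ORDER BY amount DESC
    """
    return [r[0] for r in conn.execute(sql)]


top = bucket_one("DESC")    # highest earners come first, so bucket 1 is the top 10%
bottom = bucket_one("ASC")  # lowest earners come first, so bucket 1 is the bottom 10%
print(top)
print(bottom)
```

With 20 rows and 10 buckets each bucket holds two rows, so bucket 1 is the top two earners only when the ordering is descending.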

You need to left join the special rate table.
If I recall correctly, the old Oracle syntax is like:
AND contract.contract_id = special_rate.contract_id (+)
but now special_rate.* can be null so:
+ special_rate.s_rate_amnt
will need to be:
+ coalesce(special_rate.s_rate_amnt,0)
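Both halves of this fix can be seen in a short sketch using Python's built-in sqlite3 module. SQLite does not support Oracle's (+) notation, so the sketch uses the equivalent LEFT JOIN; the two tables are trimmed down and base_amnt is a made-up column standing in for time_spent * rate_amnt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contract (contract_id INTEGER, base_amnt INTEGER);
    CREATE TABLE special_rate (contract_id INTEGER, s_rate_amnt INTEGER);
    INSERT INTO contract VALUES (1, 100), (2, 200);
    -- only contract 1 has a special rate
    INSERT INTO special_rate VALUES (1, 50);
""")

# Inner join: contract 2 disappears because it has no special_rate row.
inner = conn.execute("""
    SELECT c.contract_id, c.base_amnt + s.s_rate_amnt
    FROM contract c
    JOIN special_rate s ON s.contract_id = c.contract_id
    ORDER BY c.contract_id
""").fetchall()

# Left join alone: contract 2 is back, but NULL + number is NULL.
left_raw = conn.execute("""
    SELECT c.contract_id, c.base_amnt + s.s_rate_amnt
    FROM contract c
    LEFT JOIN special_rate s ON s.contract_id = c.contract_id
    ORDER BY c.contract_id
""").fetchall()

# Left join plus COALESCE: a missing special rate counts as 0.
left_fixed = conn.execute("""
    SELECT c.contract_id, c.base_amnt + COALESCE(s.s_rate_amnt, 0)
    FROM contract c
    LEFT JOIN special_rate s ON s.contract_id = c.contract_id
    ORDER BY c.contract_id
""").fetchall()

print(inner)
print(left_raw)
print(left_fixed)
```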
