How to combine several joins and add constraints to a query? - sql

How can I answer the following question by querying this database?
The police are looking for a woman with brown hair who checked in at the gym sometime between September 8th, 2016 and October 24th, 2016. This woman has a silver membership. Can we find her name?
I tried the following query:
dbGetQuery(db,"
SELECT *
FROM get_fit_now_member
JOIN get_fit_now_check_in ON id = membership_id
WHERE check_in_date BETWEEN '20160909' AND '20161023' AND membership_status = 'silver'
")
This gives me the following output:
The problem is that I have to join multiple times and add different constraints at the same time. How can I solve this in a clever way?

Here's how I would write the query:
SELECT m.name
FROM get_fit_now_member AS m
JOIN get_fit_now_check_in AS c ON m.id = c.membership_id
JOIN person AS p ON m.person_id = p.id
JOIN drivers_license AS d ON p.license_id = d.id
WHERE c.check_in_date BETWEEN '20160908' AND '20161024'
AND m.membership_status = 'silver'
AND d.hair_color = 'brown';
JOIN is just an operator, like + is in arithmetic. In arithmetic, you can extend the expressions with more terms, like a + b + c + d. In SQL, you can use JOIN multiple times in a similar way.
I used correlation names (m, c, p, d) to make it more convenient to qualify the table names, so I can be clear for example which id I mean in each join condition, since a column named id exists in multiple tables.
I also changed the date expression, because I assume "between" is meant to include the two dates named in the problem statement.
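To see the chained joins in action, here is a minimal sketch using Python's standard sqlite3 module. The table and column names come from the query above; the sample rows and the people's names are made up for illustration, and storing check_in_date as 'YYYYMMDD' text is an assumption about the data.

```python
import sqlite3

# In-memory database with made-up sample rows (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drivers_license (id INTEGER PRIMARY KEY, hair_color TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, license_id INTEGER);
CREATE TABLE get_fit_now_member (
    id TEXT PRIMARY KEY, person_id INTEGER, name TEXT, membership_status TEXT);
CREATE TABLE get_fit_now_check_in (membership_id TEXT, check_in_date TEXT);

INSERT INTO drivers_license VALUES (1, 'brown'), (2, 'blond'), (3, 'brown');
INSERT INTO person VALUES
    (10, 'Alice Example', 1), (11, 'Bob Example', 2), (12, 'Carol Example', 3);
INSERT INTO get_fit_now_member VALUES
    ('A1', 10, 'Alice Example', 'silver'),
    ('B2', 11, 'Bob Example', 'silver'),
    ('C3', 12, 'Carol Example', 'gold');
INSERT INTO get_fit_now_check_in VALUES
    ('A1', '20160908'), ('B2', '20160915'), ('C3', '20161001');
""")

# Each JOIN adds one more table; the WHERE clause carries all constraints.
rows = conn.execute("""
SELECT m.name
FROM get_fit_now_member AS m
JOIN get_fit_now_check_in AS c ON m.id = c.membership_id
JOIN person AS p ON m.person_id = p.id
JOIN drivers_license AS d ON p.license_id = d.id
WHERE c.check_in_date BETWEEN '20160908' AND '20161024'
  AND m.membership_status = 'silver'
  AND d.hair_color = 'brown'
""").fetchall()

names = [r[0] for r in rows]
print(names)
```

Only the member who satisfies all three constraints is returned; the check-in on the first day of the range is included because BETWEEN is inclusive.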

Related

Select only the last date when two columns are duplicate

I need to select seven columns from three different tables, only when one of the columns has a particular value. I also need to select only the last date when two columns (TAGNAME and TAGNUMMER) are both duplicate. I'm using the following code:
select c.AKEY, c.AKT_DATUM, c.TAGNAME, c.TAGNUMMER,
cd.TEILANLAGEN_ID, x.TP_GSAP_KZ, c.KLASSEN_ID
from T0EM01 c, T0EM03 x, T0AD07 cd
where cd.TEILANLAGEN_ID = '219A'
inner join
(select c.TAGNAME and c.TAGNUMMER max(C.AKT_DATUM)
where T0EM01 c c.TAGNAME and T0EM01 c c.TAGNUMMER = m.max_date
Up to where cd.TEILANLAGEN_ID = '219A' it works fine (but there are over 2 million rows).
How can I filter so that when both TAGNAME and TAGNUMMER are repeated in two or more rows I only select the latest date?
"Over 2 million rows" could be fewer if you properly joined those 3 tables. The way you wrote it, you're producing a Cartesian join and getting way too many rows.
from t0em01 c,
t0em03 x,
t0ad07 cd
I have no idea how they are to be joined to each other, so I'm just guessing; you should know.
As for the "max date value", one option is to use a subquery, also properly joined to the other table(s). Once again, I don't know exactly how to join them.
Improve it:
select c.akey,
c.akt_datum,
c.tagname,
c.tagnummer,
cd.teilanlagen_id,
x.tp_gsap_kz,
c.klassen_id
from t0em01 c join t0em03 x on x.id = c.id --> I'm just
join t0ad07 cd on cd.id = c.id -- guessing here
where cd.teilanlagen_id = '219A'
and c.akt_datum = (select max(c1.akt_datum) --> subquery, to return
from t0em01 c1 -- only the MAX date value
where c1.tagname = c.tagname
and c1.tagnummer = c.tagnummer
);
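The correlated subquery can be tried on its own with a small sketch in Python's sqlite3 module. Only the t0em01 table is modelled here, the rows are invented, and the dates are stored as ISO text (an assumption, so that MAX compares them correctly as strings).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t0em01 (akey INTEGER, akt_datum TEXT, tagname TEXT, tagnummer TEXT);
-- Two rows share (tagname, tagnummer); only the newer one should survive.
INSERT INTO t0em01 VALUES
    (1, '2020-01-01', 'PUMP',  '001'),
    (2, '2021-06-15', 'PUMP',  '001'),
    (3, '2019-03-10', 'VALVE', '002');
""")

# For each row, keep it only if its date equals the latest date
# among rows with the same tagname and tagnummer.
rows = conn.execute("""
SELECT c.akey, c.akt_datum, c.tagname, c.tagnummer
FROM t0em01 c
WHERE c.akt_datum = (SELECT MAX(c1.akt_datum)
                     FROM t0em01 c1
                     WHERE c1.tagname = c.tagname
                       AND c1.tagnummer = c.tagnummer)
ORDER BY c.akey
""").fetchall()

print(rows)
```

Rows whose (tagname, tagnummer) pair is unique pass through unchanged, since their own date is the maximum for that pair.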

How do I make a query shorter and neater?

I'm trying to make this query more understandable and neater, but I'm not sure how.
SELECT a.Patient_id, COUNT (p.Person_id) AS "Number of Operations", SUM (w.Daily_charge * (a.Discharge_date - a.Admission_date) + ot.Theatre_fee + b.Charges + c.Charges ) AS "Total Payment"
FROM person p, admission a, ward w, operation o, operation_type ot, staff b, staff c
WHERE w.Ward_code = a.Ward_code AND p.Person_id = a.Patient_id
AND a.Admission_id = o.Admission_id AND ot.Op_code = o.Actual_op
AND o.Surgeon = b.Person_id AND o.Anaesthetist = c.Person_id
GROUP BY a.Patient_id, p.Person_id
ORDER BY COUNT (p.Person_id) DESC FETCH FIRST 1 ROWS ONLY;
Any decent formatter would do it for you.
Other than that:
Use JOIN instead of comma-separated tables in the FROM clause.
Remove p.person_id from the GROUP BY clause. It serves no purpose there: it is equal to a.patient_id, which is already (correctly) in the clause, and it is not part of the SELECT statement's column list.
So:
select a.patient_id,
count (p.person_id) as "number of operations",
sum (w.daily_charge * (a.discharge_date - a.admission_date) +
ot.theatre_fee + b.charges + c.charges
) as "total payment"
from person p join admission a on a.patient_id = p.person_id
join ward w on w.ward_code = a.ward_code
join operation o on o.admission_id = a.admission_id
join operation_type ot on ot.op_code = o.actual_op
join staff b on o.surgeon = b.person_id
join staff c on o.anaesthetist = c.person_id
group by a.patient_id
order by count (p.person_id) desc
fetch first 1 rows only;
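If you want to convince yourself that the JOIN rewrite does not change the result, a rough check in Python's sqlite3 module could look like the following. The two tables are cut down to the join columns, the rows are invented, and LIMIT 1 stands in for FETCH FIRST 1 ROWS ONLY because SQLite does not support the latter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admission (admission_id INTEGER, patient_id INTEGER);
CREATE TABLE operation (op_id INTEGER, admission_id INTEGER);
INSERT INTO admission VALUES (1, 100), (2, 100), (3, 200);
INSERT INTO operation VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Old style: tables separated by commas, join condition in WHERE.
comma_style = conn.execute("""
SELECT a.patient_id, COUNT(*) AS n
FROM admission a, operation o
WHERE a.admission_id = o.admission_id
GROUP BY a.patient_id
ORDER BY n DESC
LIMIT 1
""").fetchall()

# Explicit JOIN: same rows, but the join condition sits next to the table.
join_style = conn.execute("""
SELECT a.patient_id, COUNT(*) AS n
FROM admission a
JOIN operation o ON o.admission_id = a.admission_id
GROUP BY a.patient_id
ORDER BY n DESC
LIMIT 1
""").fetchall()

print(comma_style, join_style)
```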
Sadly there is no perfect formatter, although as Ed mentioned in the comments, they can be a start if you review the settings carefully. (It's a tradition in the industry that the default formatter settings are always horrible.)
It's also been said (I think by Steven Feuerstein) that you should only set formatting rules that are supported by your formatter, and of course he makes a good point. But taken with the limitations of all formatters, an industry tradition for horrible formatting and the impossibility of consistent rules for formatting SQL anyway, that puts us PL/SQL developers in a difficult position.
I'd say the first principle of computer code layout is to use vertically aligned blocks to indicate dependency levels (similar to the grids used in graphic design). A lot of the choices then become about how to apply that principle.
We then need to separate the code into logical sections, but at the same time not let it sprawl down the page unnecessarily. I think this is difficult for automated formatters as the rules become a bit fuzzy, e.g. for a join with only one condition I keep it on one line, but if there is more than one I start splitting it out onto multiple lines, one per ON or AND keyword. The same goes for your complex sum() expression - normally I would place it all on one line, but if it aids readability then I split it up.
Finally, opinions vary on where to place commas in stacked lists, of which SQL has a lot. I say they go on the left, to act like bullet points and also make it easier to add items to the ends of lists. Others will disagree.
select ad.patient_id
, count(*) as "Number of Operations"
, sum(
wa.daily_charge * (ad.discharge_date - ad.admission_date)
+ ot.theatre_fee + ss.charges + sa.charges
) as "Total Payment"
from person pr
join admission ad on ad.patient_id = pr.person_id
join ward wa on wa.ward_code = ad.ward_code
join operation op on op.admission_id = ad.admission_id
join operation_type ot on ot.op_code = op.actual_op
join staff ss on ss.person_id = op.surgeon
join staff sa on sa.person_id = op.anaesthetist
group by ad.patient_id
order by count(*) desc
fetch first row only;

How can I access a selected column from my first select statement in my third-level subselect?

I have a table "Bed" and a table "Component". Between those two I have a m:n relation and the table "BedComponent", where I store the Bed-ID and the Component-ID.
Every Component has a price. And now I want to write a select-statement that gives me the sum of prices for a certain bed.
This is what I have:
SELECT Bed.idBed, Bed.name, SUM(src.price) AS summe, Bed.idCustomer
FROM Bed,
(SELECT price
FROM dbo.Component AS C
WHERE (C.idComponent IN
(SELECT idComponent
FROM dbo.BedComponent AS BC
WHERE 1 = BC.idBed))) AS src
GROUP BY dbo.Bed.idBed, dbo.Bed.name, dbo.Bed.idCustomer;
This statement works. But of course I don't want to hard-code the bed ID into my select, as it will always calculate the price for bed 1. Instead of the "1" I want to have the current bed ID.
I work with MS SQL Server
Thanks for your help.
I think you want:
select b.idBed, b.name, SUM(c.price) AS summe, b.idCustomer
from bed b join
bedcomponent bc
on b.idBed = bc.idBed join
component c
on c.idComponent = bc.idComponent
group by b.idBed, b.name, b.idCustomer;
The idCustomer looks strange to me in the select and group by, but I don't know what you are trying to achieve.
Also note the use of table aliases, which make the query easier to write and to read.
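Here is a small sketch of the join-based sum, run through Python's sqlite3 module rather than SQL Server. The table and column names follow the question; the rows, bed names and integer prices are invented. Note that the price is summed from the joined Component table (alias c), since there is no derived table any more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Bed (idBed INTEGER PRIMARY KEY, name TEXT, idCustomer INTEGER);
CREATE TABLE Component (idComponent INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE BedComponent (idBed INTEGER, idComponent INTEGER);
INSERT INTO Bed VALUES (1, 'Oak', 100), (2, 'Pine', 101);
INSERT INTO Component VALUES (1, 50), (2, 20), (3, 5);
-- Bed 1 uses components 1 and 2; bed 2 uses components 2 and 3.
INSERT INTO BedComponent VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# The junction table links each bed to its components, so every bed
# gets its own sum without any hard-coded bed id.
rows = conn.execute("""
SELECT b.idBed, b.name, SUM(c.price) AS summe, b.idCustomer
FROM Bed b
JOIN BedComponent bc ON b.idBed = bc.idBed
JOIN Component c ON c.idComponent = bc.idComponent
GROUP BY b.idBed, b.name, b.idCustomer
ORDER BY b.idBed
""").fetchall()

print(rows)
```

A bed with no components would not appear at all with these inner joins; a LEFT JOIN would be needed to list it with a NULL sum.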

Include missing years in Group By query

I am fairly new in Access and SQL programming. I am trying to do the following:
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
and group by year even when there is no amount in some of the years. I would like to have these years listed as well for a report with charts. I'm not certain if this is possible, but every bit of help is appreciated.
My code so far is as follows:
SELECT
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
FROM
Base_CustomerT
INNER JOIN (
SO_SalesOrderPaymentHistoryLineT
INNER JOIN SO_SalesOrderT
ON SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId
) ON Base_CustomerT.CustomerId = SO_SalesOrderT.CustomerId
GROUP BY
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
SO_SalesOrderPaymentHistoryLineT.PaymentType,
Base_CustomerT.IsActive
HAVING
(((SO_SalesOrderPaymentHistoryLineT.PaymentType)=1)
AND ((Base_CustomerT.IsActive)=Yes))
ORDER BY
Base_CustomerT.SalesRep,
Base_CustomerT.Customer;
You need another table with all years listed -- you can create this on the fly or have one in the db... join from that. So if you had a table called alltheyears with a column called y that just listed the years then you could use code like this:
WITH minmax as
(
select min(year(SO_SalesOrderPaymentHistoryLineT.DatePaid)) as minyear,
max(year(SO_SalesOrderPaymentHistoryLineT.DatePaid)) as maxyear
from SO_SalesOrderPaymentHistoryLineT
), yearsused as
(
select y
from alltheyears, minmax
where alltheyears.y >= minyear and alltheyears.y <= maxyear
)
select *
from yearsused
left join (
-- your query above goes here!
) T
ON year(T.DatePaid) = yearsused.y
You need a data source that will provide the year numbers. You cannot manufacture them out of thin air. Supposing you had a table Interesting_year with a single column year, populated, say, with every distinct integer between 2000 and 2050, you could do something like this:
SELECT
base.SalesRep,
base.CustomerId,
base.Customer,
base.year,
Sum(NZ(data.Amount)) AS [Sum Of PaymentPerYear]
FROM
(SELECT * FROM Base_CustomerT INNER JOIN Interesting_year) AS base
LEFT JOIN
(SELECT * FROM
SO_SalesOrderT
INNER JOIN SO_SalesOrderPaymentHistoryLineT
ON (SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId)
WHERE (SO_SalesOrderPaymentHistoryLineT.PaymentType = 1)
) AS data
ON ((base.CustomerId = data.CustomerId)
AND (base.year = Year(data.DatePaid)))
WHERE
(base.IsActive = Yes)
AND (base.year BETWEEN
(SELECT Min(year(DatePaid)) FROM SO_SalesOrderPaymentHistoryLineT)
AND (SELECT Max(year(DatePaid)) FROM SO_SalesOrderPaymentHistoryLineT))
GROUP BY
base.SalesRep,
base.CustomerId,
base.Customer,
base.year
ORDER BY
base.SalesRep,
base.Customer;
Note the following:
The revised query first forms the Cartesian product of Base_CustomerT with Interesting_year in order to have base customer data associated with each year (this is sometimes called a CROSS JOIN, but it's the same thing as an INNER JOIN with no join predicate, which is what Access requires).
In order to have result rows for years with no payments, you must perform an outer join (in this case a LEFT JOIN). Where a (base customer, year) combination has no associated orders, the rest of the columns of the join result will be NULL.
I'm selecting the CustomerId from Base_CustomerT because you would sometimes get a NULL if you selected from SO_SalesOrderT as in the starting query
I'm using the Access Nz() function to convert NULL payment amounts to 0 (from rows corresponding to years with no payments)
I converted your HAVING clause to a WHERE clause. That's semantically equivalent in this particular case, and it will be more efficient because the WHERE filter is applied before groups are formed, and because it allows some columns to be omitted from the GROUP BY clause.
Following Hogan's example, I filter out data for years outside the overall range covered by your data. Alternatively, you could achieve the same effect without that filter condition and its subqueries by ensuring that table Interesting_year contains only the year numbers for which you want results.
Update: modified the query to a different, but logically equivalent "something like this" that I hope Access will like better. Aside from adding a bunch of parentheses, the main difference is making both the left and the right operand of the LEFT JOIN into a subquery. That's consistent with the consensus recommendation for resolving Access "ambiguous outer join" errors.
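The "years table plus outer join" idea can be sketched outside Access with Python's sqlite3 module. This is a simplified illustration, not the Access query itself: Customer and Payment are hypothetical cut-down tables, PaidYear replaces Year(DatePaid), and COALESCE plays the role that Nz() plays in Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Interesting_year (year INTEGER);
CREATE TABLE Customer (CustomerId INTEGER, Customer TEXT);
CREATE TABLE Payment (CustomerId INTEGER, PaidYear INTEGER, Amount INTEGER);
INSERT INTO Interesting_year VALUES (2019), (2020), (2021);
INSERT INTO Customer VALUES (1, 'Acme');
-- No payments at all in 2020.
INSERT INTO Payment VALUES (1, 2019, 100), (1, 2019, 50), (1, 2021, 30);
""")

# Step 1: pair every customer with every year (Cartesian product).
# Step 2: LEFT JOIN the payments so years without payments are kept.
# Step 3: turn the resulting NULL sums into 0.
rows = conn.execute("""
SELECT b.CustomerId, b.year, COALESCE(SUM(p.Amount), 0) AS total
FROM (SELECT c.CustomerId, y.year
      FROM Customer c CROSS JOIN Interesting_year y) AS b
LEFT JOIN Payment p
       ON p.CustomerId = b.CustomerId AND p.PaidYear = b.year
GROUP BY b.CustomerId, b.year
ORDER BY b.year
""").fetchall()

print(rows)
```

Any filter on the payment side belongs in the ON clause or inside a subquery; a WHERE condition on a column of the outer-joined table would discard the NULL rows again and drop the empty years.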
Thank you, John, for your help. I found a solution which works for me. It looks quite different, but I learned a lot from it. If you are interested, here is how it looks now.
SELECT DISTINCTROW
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
FROM
Base_Customer_RevenueYearQ
LEFT JOIN CustomerPaymentPerYearQ
ON (Base_Customer_RevenueYearQ.RevenueYear = CustomerPaymentPerYearQ.[RevenueYear])
AND (Base_Customer_RevenueYearQ.CustomerId = CustomerPaymentPerYearQ.CustomerId)
GROUP BY
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
;

Sorting rows by count of a many-to-many associated record

I know there are a lot of other SO entries that seem like this one, but I haven't found one that actually answers my question so hopefully one of you can either answer it or point me to another SO question that is related.
Basically, I have the following query that returns Venues that have any CheckIns that contain the searched Keyword ("foobar" in this example).
SELECT DISTINCT v.*
FROM "venues" v
INNER JOIN "check_ins" c ON c."venue_id" = v."id"
INNER JOIN "keywordings" ks ON ks."check_in_id" = c."id"
INNER JOIN "keywords" k ON ks."keyword_id" = k."id"
WHERE (k."name" = 'foobar')
I want to SELECT and ORDER BY the count of the matched Keyword for each given Venue. E.g. if there have been 5 CheckIns that have been created, associated with that Keyword, then there should be a returned column (called something like keyword_count) with the value 5 which is sorted.
Ideally this should be done without any subqueries in the SELECT clause, or preferably with none at all.
I've been struggling with this for a while and my mind is just going blank (perhaps it's been too long a day) so some help would be greatly appreciated here.
Thanks in advance!
Sounds like you need something like:
SELECT v.x, v.y, count(*) AS keyword_count
FROM "venues" v
INNER JOIN "check_ins" c ON c."venue_id" = v."id"
INNER JOIN "keywordings" ks ON ks."check_in_id" = c."id"
INNER JOIN "keywords" k ON ks."keyword_id" = k."id"
WHERE (k."name" = 'foobar')
GROUP BY v.x, v.y
ORDER BY 3
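For a concrete run of the grouped count, here is a minimal sketch with Python's sqlite3 module. The table names follow the question; the venue columns (id, name) and the sample rows are invented, and sorting in descending order is an assumption about what "sorted" should mean here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venues (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE check_ins (id INTEGER PRIMARY KEY, venue_id INTEGER);
CREATE TABLE keywords (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE keywordings (check_in_id INTEGER, keyword_id INTEGER);
INSERT INTO venues VALUES (1, 'Cafe'), (2, 'Bar');
INSERT INTO check_ins VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
INSERT INTO keywords VALUES (1, 'foobar'), (2, 'other');
-- Cafe has one 'foobar' check-in, Bar has three.
INSERT INTO keywordings VALUES (1, 1), (2, 2), (3, 1), (4, 1), (5, 1);
""")

# GROUP BY replaces DISTINCT: one row per venue, with the number of
# matching check-ins counted alongside it. No subquery is needed.
rows = conn.execute("""
SELECT v.id, v.name, COUNT(*) AS keyword_count
FROM venues v
INNER JOIN check_ins c ON c.venue_id = v.id
INNER JOIN keywordings ks ON ks.check_in_id = c.id
INNER JOIN keywords k ON ks.keyword_id = k.id
WHERE k.name = 'foobar'
GROUP BY v.id, v.name
ORDER BY keyword_count DESC
""").fetchall()

print(rows)
```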