T-SQL programming - sql

I need to write a query that returns the following columns:
membership_id,
person_id,
first_name,
last_name
for people who joined or called yesterday. People who didn't join will have only a person_id.
But the query below returns all the records from the table.
SELECT
dbo.pn_membership.membership_id,
dbo.pn.person_id,
dbo.pn.first_name,
dbo.pn.surname,
dbo.pn.create_datetime
FROM
dbo.pn
LEFT OUTER JOIN
dbo.pn_membership
ON dbo.pn.person_id=dbo.pn_membership.person_id AND
dbo.pn.create_datetime >=getdate()-1
I need the records only for the day before the run date.

Try this version.
You write in your question "who joined or called yesterday",
but your query does "who joined or called in the last 24 hours",
which is different. Also, as Sparky noted, the date filter needs to be
in a WHERE clause. My version does "who joined or called yesterday".
SELECT
dbo.pn_membership.membership_id,
dbo.pn.person_id,
dbo.pn.first_name,
dbo.pn.surname,
dbo.pn.create_datetime
FROM
dbo.pn
LEFT OUTER JOIN
dbo.pn_membership
ON dbo.pn.person_id=dbo.pn_membership.person_id
WHERE
dbo.pn.create_datetime >= DATEADD(Day, DATEDIFF(Day, 0, getdate()), -1)
AND
dbo.pn.create_datetime < DATEADD(Day, DATEDIFF(Day, 0, getdate()), 0)
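To make the difference between "yesterday" and "the last 24 hours" concrete, here is a small Python sketch of the two windows. The function names and the run time are hypothetical and only mirror what the two T-SQL expressions compute.

```python
from datetime import datetime, timedelta

def yesterday_window(now):
    """Return the half-open interval [yesterday 00:00, today 00:00)."""
    today_midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return today_midnight - timedelta(days=1), today_midnight

def last_24_hours_start(now):
    """Mirror getdate()-1: the same clock time one day earlier."""
    return now - timedelta(days=1)

run_time = datetime(2020, 1, 10, 15, 30)  # hypothetical run time
start, end = yesterday_window(run_time)
print(start, end)                     # 2020-01-09 00:00:00 2020-01-10 00:00:00
print(last_24_hours_start(run_time))  # 2020-01-09 15:30:00
```

A row created at 09:00 on the previous day falls inside the first window but outside the second, which is why the two filters return different rows.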

Are you saying that it pulls records with a date < getdate()-1, or that it pulls records where
person_id is NULL?
If the latter, try this:
SELECT dbo.pn_membership.membership_id, dbo.pn.person_id, dbo.pn.first_name,
dbo.pn.surname, dbo.pn.create_datetime FROM dbo.pn LEFT OUTER JOIN dbo.pn_membership ON
dbo.pn.person_id=dbo.pn_membership.person_id AND dbo.pn.create_datetime >=getdate()-1
and dbo.pn.person_id is not NULL

Try this:
SELECT
dbo.pn_membership.membership_id,
dbo.pn.person_id,
dbo.pn.first_name,
dbo.pn.surname,
dbo.pn.create_datetime
FROM
dbo.pn
LEFT OUTER JOIN
dbo.pn_membership
ON dbo.pn.person_id=dbo.pn_membership.person_id
WHERE dbo.pn.create_datetime >=getdate()-1
Your query says:
Give me some fields from the pn table.
Also, if the person has a matching membership record, give me that
information.
If they don't, give me the fields from the membership table with NULL
values.
By moving the date test to the WHERE clause, you reduce the rows returned from the pn table. With the date test as part of the JOIN, every pn row is still returned; you only increase the number of rows with NULL values in the membership columns.
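The effect can be reproduced on a tiny data set. This is a minimal sketch using Python's sqlite3 module rather than SQL Server; the rows and the literal cutoff date are made up for illustration, and only the join behaviour is meant to carry over.

```python
import sqlite3

# Hypothetical miniature of the pn / pn_membership tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pn (person_id INTEGER, create_datetime TEXT);
    CREATE TABLE pn_membership (membership_id INTEGER, person_id INTEGER);
    INSERT INTO pn VALUES (1, '2020-01-01'), (2, '2020-01-09'), (3, '2020-01-09');
    INSERT INTO pn_membership VALUES (100, 1), (200, 2);
""")

# Date test inside ON: every pn row survives; old rows just lose their match.
on_rows = conn.execute("""
    SELECT m.membership_id, p.person_id
    FROM pn AS p
    LEFT JOIN pn_membership AS m
      ON p.person_id = m.person_id AND p.create_datetime >= '2020-01-09'
    ORDER BY p.person_id
""").fetchall()

# Date test inside WHERE: old pn rows are removed from the result.
where_rows = conn.execute("""
    SELECT m.membership_id, p.person_id
    FROM pn AS p
    LEFT JOIN pn_membership AS m ON p.person_id = m.person_id
    WHERE p.create_datetime >= '2020-01-09'
    ORDER BY p.person_id
""").fetchall()

print(on_rows)     # [(None, 1), (200, 2), (None, 3)]
print(where_rows)  # [(200, 2), (None, 3)]
```

Person 1 is older than the cutoff. With the test in ON the row is still returned, only without its membership_id; with the test in WHERE it is gone.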

It looks like the problem is at the very end of your query. Instead of AND dbo.pn.create_datetime >=getdate()-1, try WHERE dbo.pn.create_datetime >=getdate()-1. Including your filter criteria as part of the OUTER JOIN statement isn't the same thing as using a WHERE clause. See SQL Standard Regarding Left Outer Join and Where Conditions also.

Related

Full Outer Join Not including all records from both sides

I have a calendar table with dates from 2000 through 2029. I have done a full outer join with my data table on date=tdate. My goal is to count trips occurring by date and to show a zero if there are no trips. However, all dates are not showing up. If there is no trip date in the data table, the date from the Calendar table does not show up at all. I've never seen this happen before. I've tried changing it to a left outer join which didn't work either. Actual query below.
SELECT DISTINCT c.DATE,
COALESCE(COUNT(t.TripID), 0) AS tripCount
FROM dbo.Calendar c
FULL OUTER JOIN dbo.TripsMade t ON c.DATE = CONVERT(DATE, t.tripdate, 126)
WHERE (c.DATE BETWEEN '2019-09-01' AND '2019-09-30' )
AND (t.compid = 270 OR t.compid IS NULL)
GROUP BY c.DATE
ORDER BY c.DATE
If I select all dates from the Calendar Table, they are all there. There are dates missing from the Data Table, and always will have some missing. I need all dates to show up on this report and show a zero if there are no associated records on the Data Table side.
What am I missing here?
Goal
DATE TOTAL_TRIPS
2019-09-01 3
2019-09-02 5
2019-09-03 0 <== This row is currently missing completely
2019-09-04 4
2019-09-05 0 <== This row is currently missing completely
2019-09-06 9
Your WHERE clause ends up filtering non-matches out from both tables, turning them almost into INNER JOINs (it is a little complicated with the condition on t).
I don't think you need a FULL JOIN. I suspect you want a LEFT JOIN with filtering (for the second table) in the ON clause:
SELECT c.Date, COUNT(t.TripID) AS tripCount
FROM dbo.Calendar c LEFT OUTER JOIN
dbo.TripsMade t
ON c.date = CONVERT(DATE, t.tripdate, 126) AND
t.compid = 270
WHERE c.Date BETWEEN '2019-09-01' AND '2019-09-30'
GROUP BY c.Date
ORDER BY c.Date;
If compid can actually take on NULL values and you want them, then use (t.compid = 270 OR t.compid IS NULL). I suspect, though, that is not what you want. The NULL just represents a non-match on the JOIN.
Notes:
SELECT DISTINCT is almost never used with GROUP BY, and it is not needed in this case.
COUNT() does not return NULL, so there is no need for COALESCE().
Filtering on the first table does belong in the WHERE clause.
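A date disappears whenever its only trips belong to another company: the join finds a match, so compid is neither 270 nor NULL, and the WHERE clause drops the row. The sketch below reproduces this with Python's sqlite3 module; the three calendar dates and four trips are made-up sample data, and tripdate is stored as a plain date string so no CONVERT is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Calendar (date TEXT);
    CREATE TABLE TripsMade (TripID INTEGER, tripdate TEXT, compid INTEGER);
    INSERT INTO Calendar VALUES ('2019-09-01'), ('2019-09-02'), ('2019-09-03');
    INSERT INTO TripsMade VALUES
        (1, '2019-09-01', 270), (2, '2019-09-01', 270),
        (3, '2019-09-02', 999),   -- a different company's trip
        (4, '2019-09-03', 270);
""")

# compid test in WHERE: 2019-09-02 matches trip 3, then fails the filter.
where_version = conn.execute("""
    SELECT c.date, COUNT(t.TripID)
    FROM Calendar AS c
    LEFT JOIN TripsMade AS t ON c.date = t.tripdate
    WHERE t.compid = 270 OR t.compid IS NULL
    GROUP BY c.date ORDER BY c.date
""").fetchall()

# compid test in ON: 2019-09-02 has no qualifying match and counts as 0.
on_version = conn.execute("""
    SELECT c.date, COUNT(t.TripID)
    FROM Calendar AS c
    LEFT JOIN TripsMade AS t ON c.date = t.tripdate AND t.compid = 270
    GROUP BY c.date ORDER BY c.date
""").fetchall()

print(where_version)  # [('2019-09-01', 2), ('2019-09-03', 1)]
print(on_version)     # [('2019-09-01', 2), ('2019-09-02', 0), ('2019-09-03', 1)]
```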

Impala SQL LEFT ANTI JOIN

The goal is to find the empids for a given time range that are present in the LEFT table but not in the RIGHT table.
I ran the following two Impala queries and got different results.
QUERY 1: select count(dbonetable.empid), COUNT(DISTINCT dbtwotable.empid) from
(select distinct dbonetable.empid
from dbonedbtable dbonetable
WHERE (dbonetable.expiration_dt >= '2009-01-01' OR dbonetable.expiration_dt IS NULL) AND dbonetable.effective_dt <= '2019-01-01' AND dbonetable.empid IS NOT NULL) dbonetable
LEFT join dbtwodbtable dbtwotable ON dbonetable.empid = dbtwotable.empid
--43324489 43270569
QUERY 2: select count(*) from (
select distinct dbonetable.empid from dbonedbtable dbonetable
LEFT ANTI join dbtwodbtable dbtwotable ON dbonetable.empid = dbtwotable.empid
AND (dbonetable.expiration_dt >= '2009-01-01' OR dbonetable.expiration_dt IS NULL) AND dbonetable.effective_dt <= '2019-01-01' AND dbonetable.empid IS NOT NULL) tab
--19088973
--For LEFT ANTI JOIN, this clause returns those values from the left-hand table that have no matching value in the right-hand table.
To explain the context:
Query 2: I am trying to find all the empids that are in dbonetable and not in dbtwotable using LEFT ANTI JOIN, which I learned about here:
https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_joins.html
--For LEFT ANTI JOIN, this clause returns those values from the left-hand table that have no matching value in the right-hand table.
And in Query 1:
dbonetable is filtered by the WHERE clause, and the result is LEFT OUTER joined with dbtwotable. On top of that result, I compute count(dbonetable.empid) and COUNT(DISTINCT dbtwotable.empid), which gave 43324489 and 43270569, a difference of 53,920.
My question: which is right, the Query 1 result of 43324489 - 43270569 = 53,920, or the Query 2 result of 19088973?
What could be missing here? Is my Query 1 incorrect, or am I misusing LEFT ANTI JOIN?
Thank you all in advance.
It's different because you forgot to specify "where dbtwotable.empid is null" in query 1.
Additionally, query 2 is logically different from query 1: in query 1 you join only on equality of the two empid columns, while in query 2 the join has many more conditions, so far fewer rows count as matches than in query 1 and, as a result, the final count is much larger.
If you make the join condition in query 2 the same as in query 1 and put everything else into the WHERE clause, you will get the same count as the updated query 1, which is 53920. That's the count you need.
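The following sketch shows why extra conditions in the anti-join inflate the count. SQLite has no LEFT ANTI JOIN, so NOT EXISTS stands in for it here, on the assumption that the anti-join returns the left rows for which no right row satisfies the whole ON condition. The table names one and two and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (empid INTEGER, effective_dt TEXT);
    CREATE TABLE two (empid INTEGER);
    INSERT INTO one VALUES (1, '2018-01-01'), (2, '2018-01-01'),
                           (3, '2020-01-01'), (4, '2020-01-01');
    INSERT INTO two VALUES (1), (3);
""")

# Filter on the left table folded into the anti-join condition (as in Query 2):
# a left row that fails the date test can never "match", so it is returned.
filter_in_join = conn.execute("""
    SELECT DISTINCT o.empid FROM one AS o
    WHERE NOT EXISTS (
        SELECT 1 FROM two AS t
        WHERE t.empid = o.empid AND o.effective_dt <= '2019-01-01')
    ORDER BY o.empid
""").fetchall()

# Anti-join on empid only; the left-table filter lives in WHERE.
filter_in_where = conn.execute("""
    SELECT DISTINCT o.empid FROM one AS o
    WHERE o.effective_dt <= '2019-01-01'
      AND NOT EXISTS (SELECT 1 FROM two AS t WHERE t.empid = o.empid)
    ORDER BY o.empid
""").fetchall()

print(filter_in_join)   # [(2,), (3,), (4,)]
print(filter_in_where)  # [(2,)]
```

Employees 3 and 4 fall outside the date range, yet the first form returns them because the date test only decides what counts as a match; it never removes left rows.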

Do I use inner join or left join?

I have two separate tables I'm pulling data from with the associate_id being the primary key. I'm trying to find all sales(sales_charge found in sales.dim) made by Associate_ID over several transactions within the last 4 months and the last year. I'm having a hard time with the time stamp and the joins.
Here's what I have so far:
SELECT associate_id
, sales.dim.sales_charge
FROM dbo.associate
LEFT JOIN dbo.sales ahd
ON associate_id = ahd.associate_id
AND ahd.end_dt > GETDATE()
I'm new to SQL and coding in general, please let me know what I'm missing.
Thanks
If you want to include all associates, even those with no sales, then use left join:
SELECT a.associate_id, ahd.dim.sales_charge
FROM dbo.associate a LEFT JOIN
dbo.sales ahd
ON a.associate_id = ahd.associate_id AND
ahd.end_dt > DATEADD(month, -4, GETDATE());
The answer to the question "Do I use inner join or left join?" is that you use an inner join when you want to include only matching records from both tables, whereas a left (outer) join includes all the records from the left-side table.
In the query you are attempting, if you want all associates included in the result set even if they have no sales in the last 4 months, use LEFT JOIN. If you want only those associates who have one or more sales, use INNER JOIN.
Another problem is with the condition "ahd.end_dt > GETDATE()". This matches only end dates later than the current time. Change it to "ahd.end_dt > DATEADD(month, -4, GETDATE())".
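As a quick illustration of the difference, here is a sketch with Python's sqlite3 module. The two-column tables and their rows are hypothetical stand-ins, and the date condition is left out to keep the focus on the join type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE associate (associate_id INTEGER);
    CREATE TABLE sales (associate_id INTEGER, sales_charge REAL);
    INSERT INTO associate VALUES (1), (2);
    INSERT INTO sales VALUES (1, 9.5), (1, 20.0);
""")

# Same query text; only the join keyword changes.
query = """
    SELECT a.associate_id, s.sales_charge
    FROM associate AS a
    {join} JOIN sales AS s ON a.associate_id = s.associate_id
    ORDER BY a.associate_id, s.sales_charge
"""
inner_rows = conn.execute(query.format(join="INNER")).fetchall()
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)  # [(1, 9.5), (1, 20.0)]
print(left_rows)   # [(1, 9.5), (1, 20.0), (2, None)]
```

Associate 2 has no sales, so it appears only in the LEFT JOIN result, with a NULL sales_charge.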

SQL Query to count the records

I am making up a SQL query which will get all the transaction types from one table, and from the other table it will count the frequency of that transaction type.
My query is this:
with CTE as
(
select a.trxType,a.created,b.transaction_key,b.description,a.mode
FROM transaction_data AS a with (nolock)
RIGHT JOIN transaction_types b with (nolock) ON b.transaction_key = a.trxType
)
SELECT COUNT (trxType) AS Frequency, description as trxType,mode
from CTE where created >='2017-04-11' and created <= '2018-04-13'
group by trxType ,description,mode
The transaction_types table contains all the types of transactions only and transaction_data contains the transactions which have occurred.
The problem I am facing is that even though it's the RIGHT join, it does not select all the records from the transaction_types table.
I need to select all the transactions from the transaction_types table and show the number of counts for each transaction, even if it's 0.
Please help.
LEFT JOIN is so much easier to follow.
I think you want:
select tt.transaction_key, tt.description, t.mode, count(t.trxType)
from transaction_types tt left join
transaction_data t
on tt.transaction_key = t.trxType and
t.created >= '2017-04-11' and t.created <= '2018-04-13'
group by tt.transaction_key, tt.description, t.mode;
Notes:
Use reasonable table aliases! a and b mean nothing. t and tt are abbreviations of the table name, so they are easier to follow.
t.mode will be NULL for non-matching rows.
The condition on dates needs to be in the ON clause. Otherwise, the outer join is turned into an inner join.
LEFT JOIN is easier to follow (at least for people whose native language reads left-to-right) because it means "keep all the rows in the table you have already read".

Correct SQL query statement requiring JOIN and UNION

I was trying to solve this question (No. 29) on http://www.sql-ex.ru/
Under the assumption that the income (inc) and expenses (out) of the
money at each outlet are written not more than once a day, get a
result set with the fields: point, date, income, expense. Use Income_o
and Outcome_o tables.
And came up with this solution
SELECT Income_o.point, Income_o.date, Income_o.inc, Outcome_o.out
FROM Income_o
INNER JOIN Outcome_o ON Income_o.point = Outcome_o.point
The result is obviously wrong (hence my question here). The task assumes that a point will never have more than one income and one expense per day, so isn't this query correct? I can see from the same page that the correct query has some NULL column values. I would appreciate an explanation (if not the correct answer). I am no SQL master (which is why I am working through these exercises; so far I have done 29 out of 125 and needed help from SO on only 3 of them).
The expected result is (From the website):
The result of correct query:
A snapshot of the expected result is here - http://snag.gy/yN43V.jpg
P.S. I know that the hint says UNION and JOIN and trying to get my head around this. If I can get the answer myself, I will post it.
You want a full outer join on point and date:
SELECT
COALESCE(i.point, o.point) AS point,
COALESCE(i.date, o.date) AS date,
i.inc,
o.out
FROM
Income_o AS i
FULL JOIN Outcome_o AS o ON i.point = o.point AND i.date = o.date
;
The COALESCE expressions ensure that NULL is not returned for those columns: if the Income_o side has a NULL (because the table has no match for an Outcome_o row), the value is then taken from the other side.
Alternatively you can go with a union of two outer joins, left and right:
SELECT
i.point,
i.date,
i.inc,
o.out
FROM
Income_o AS i
LEFT JOIN Outcome_o AS o ON i.point = o.point AND i.date = o.date
UNION
SELECT
o.point,
o.date,
i.inc,
o.out
FROM
Income_o AS i
RIGHT JOIN Outcome_o AS o ON i.point = o.point AND i.date = o.date
;
If the tables have matches on the specified condition, both joins will return them, but UNION will eliminate duplicate entries. This second method is essentially an alternative implementation of full outer join, useful for cases where the FULL JOIN syntax is not supported. (MySQL is one product that does not support FULL JOIN.)
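The union approach can be tried on a few rows with Python's sqlite3 module. This is a sketch under stated assumptions: the sample rows are made up, and the second branch is written as a LEFT JOIN with the tables swapped (equivalent to the RIGHT JOIN above) because older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Income_o (point INTEGER, date TEXT, inc INTEGER);
    CREATE TABLE Outcome_o (point INTEGER, date TEXT, out INTEGER);
    INSERT INTO Income_o VALUES (1, '2001-03-22', 15000), (1, '2001-03-23', 15000);
    INSERT INTO Outcome_o VALUES (1, '2001-03-23', 2000), (2, '2001-03-24', 500);
""")

# UNION removes the row that both branches return for the matched day.
rows = conn.execute("""
    SELECT i.point, i.date, i.inc, o.out
    FROM Income_o AS i
    LEFT JOIN Outcome_o AS o ON i.point = o.point AND i.date = o.date
    UNION
    SELECT o.point, o.date, i.inc, o.out
    FROM Outcome_o AS o
    LEFT JOIN Income_o AS i ON i.point = o.point AND i.date = o.date
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
# (1, '2001-03-22', 15000, None)
# (1, '2001-03-23', 15000, 2000)
# (2, '2001-03-24', None, 500)
```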
You can use GROUP BY with an aggregate function to achieve the desired result. The subquery combines the two result sets, but if there are two transactions (in and out) on the same date, they appear as two rows; GROUP BY with an aggregate function collapses them into one row.
select point, date, max(inc), max(out)
from
(
select point, date, inc, NULL as out
from income_o
union all
select point, date, NULL, out
from outcome_o
)
dt
group by point, date
I think you are looking for LEFT JOIN
SELECT Income_o.point, Income_o.dat, Income_o.inc, Outcome_o.outc
FROM Income_o
LEFT JOIN Outcome_o ON Income_o.point = Outcome_o.point
try this SQL Fiddle example