SQL Join When Dates Codes are Involved


I was curious about something. Let's say I have two tables, one with sales and promo codes, the other with only promo codes and attributes. Promo codes can be turned on and off, but promo attributes can change. Here is the table structure:
tblSales
sale  promo_cd  date
2     AAA       1/1/2013
3     AAA       6/2/2013
8     BBB       2/2/2013
9     BBB       2/3/2013
10    BBB       8/1/2013
tblPromo
promo_cd  attribute  active_dt  inactive_dt
AAA       "fun"      1/1/2013   1/1/3001
BBB       "boo"      1/1/2013   6/1/2013
BBB       "green"    6/2/2013   1/1/3001
Please note, this is not my table/schema/design. I don't understand why they don't just make new promo_cd's for each change in attribute, especially when attribute is what we want to measure. Anyway, I'm trying to make a table that looks like this:
sale  promo_cd  attribute
2     AAA       fun
3     AAA       fun
8     BBB       boo
9     BBB       boo
10    BBB       green
The only thing I have done so far is just create an inner join (which causes duplicate records) and then filter by comparing the sale date to the promo active/inactive dates. Is there a better way to do this, though? I was really curious since this is a pretty big set of data and I'd love to keep it efficient.

This is one of those cases where I like to put the filtering conditions right into the JOIN clause. At least in my brain, the duplicate records never make it into the result set. That leaves the WHERE clause for actual filtering conditions.
Select s.sale, s.promo_cd, p.attribute
From tblSales s
Inner Join tblPromo p
on s.promo_cd=p.promo_cd
and s.date between p.active_dt and p.inactive_dt
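To check the join against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The dates are rewritten as ISO strings (an assumption made so that BETWEEN compares them correctly as text in SQLite); a real database with proper date columns would not need that.

```python
import sqlite3

# In-memory database holding the sample data from the question.
# Dates are stored as ISO-8601 text (assumption) so string comparison is chronological.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblSales (sale INTEGER, promo_cd TEXT, date TEXT);
    CREATE TABLE tblPromo (promo_cd TEXT, attribute TEXT, active_dt TEXT, inactive_dt TEXT);
    INSERT INTO tblSales VALUES
        (2, 'AAA', '2013-01-01'),
        (3, 'AAA', '2013-06-02'),
        (8, 'BBB', '2013-02-02'),
        (9, 'BBB', '2013-02-03'),
        (10, 'BBB', '2013-08-01');
    INSERT INTO tblPromo VALUES
        ('AAA', 'fun',   '2013-01-01', '3001-01-01'),
        ('BBB', 'boo',   '2013-01-01', '2013-06-01'),
        ('BBB', 'green', '2013-06-02', '3001-01-01');
""")

# The date range is part of the join condition, so each sale only pairs
# with the promo row that was active on the sale date.
rows = conn.execute("""
    SELECT s.sale, s.promo_cd, p.attribute
    FROM tblSales s
    INNER JOIN tblPromo p
        ON s.promo_cd = p.promo_cd
       AND s.date BETWEEN p.active_dt AND p.inactive_dt
    ORDER BY s.sale
""").fetchall()

for row in rows:
    print(row)
```

Each sale comes back exactly once, with the attribute that was active on its date.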

Assuming I understand you correctly, you can use:
SELECT s.sale, s.promo_cd, p.attribute
FROM tblSales s
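JOIN tblPromo p ON p.promo_cd = s.promo_cd AND s.date BETWEEN p.active_dt AND p.inactive_dt
This assumes that tblPromo dates will never overlap for the same promo_cd (which seems likely given the schema they chose). If two date ranges did overlap, a sale falling in both would be returned twice.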
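If you want to verify the no-overlap assumption rather than trust it, one possible check is a self-join on tblPromo that looks for two rows of the same promo_cd whose date ranges intersect. The sketch below uses Python's sqlite3 with ISO date strings; the 'red' row is a hypothetical bad record added only so the check has something to report, and rowid is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblPromo (promo_cd TEXT, attribute TEXT, active_dt TEXT, inactive_dt TEXT)"
)
conn.executemany(
    "INSERT INTO tblPromo VALUES (?, ?, ?, ?)",
    [
        ("AAA", "fun", "2013-01-01", "3001-01-01"),
        ("BBB", "boo", "2013-01-01", "2013-06-01"),
        ("BBB", "green", "2013-06-02", "3001-01-01"),
        # Hypothetical overlapping row, not part of the original data.
        ("BBB", "red", "2013-05-15", "2013-07-01"),
    ],
)

# Two ranges overlap when each one starts on or before the other one ends.
# a.rowid < b.rowid reports every overlapping pair once (rowid is SQLite-specific).
overlaps = conn.execute("""
    SELECT a.promo_cd, a.attribute, b.attribute
    FROM tblPromo a
    JOIN tblPromo b
      ON a.promo_cd = b.promo_cd
     AND a.rowid < b.rowid
     AND a.active_dt <= b.inactive_dt
     AND b.active_dt <= a.inactive_dt
    ORDER BY a.rowid, b.rowid
""").fetchall()

for pair in overlaps:
    print(pair)
```

An empty result means the date-range join cannot duplicate a sale.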

Just add the date to your JOIN criteria:
SELECT a.sale, a.promo_cd, b.attribute
FROM tblSales a
JOIN tblPromo b
ON a.promo_cd = b.promo_cd
AND a.date BETWEEN b.active_dt AND b.inactive_dt

Related

SQL with Bob and John owing each other money

I have got the following 3 fields in a file: person_ows person_is_owed amount
Example content:
Bob John 100
John Bob 110
What does a SQL query look like that produces:
Bob John 100 110
John Bob 110 100
Sorry if this is a trivial question, but I am just trying to learn SQL and I find it really like HELL!

So, what you need is to be able to JOIN two rows. In this case you'll probably want an OUTER JOIN assuming that there isn't always a match of each owing the other. Now you just need to come up with your JOIN criteria, which in this case is going to be based on the names (person_owes and person_is_owed):
SELECT
T1.person_owes,
T1.person_is_owed,
T1.amount AS owes_amount,
COALESCE(T2.amount, 0) AS is_owed_amount
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON T2.person_is_owed = T1.person_owes AND T2.person_owes = T1.person_is_owed
The COALESCE is just to make sure that when there is no match that you get a value of 0 instead of NULL.
Also, this assumes that there is only going to be one of each combination of person_owes and person_is_owed. If you might have two rows showing that John owes Bill two different amounts of money then you would have to adjust the SQL above and it would be a bit more complex.
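As a quick way to try the self-join, here is a sketch in Python's sqlite3 that matches each row to its reverse on both names. The table and column names follow this answer; the third row (Ann owing Bob) is a hypothetical addition to show the COALESCE fallback when nobody owes money back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE My_Table (person_owes TEXT, person_is_owed TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO My_Table VALUES (?, ?, ?)",
    [
        ("Bob", "John", 100),
        ("John", "Bob", 110),
        # Hypothetical one-way debt: Bob owes Ann nothing in return.
        ("Ann", "Bob", 25),
    ],
)

# T2 is the "reverse" row: the same two people with the roles swapped.
rows = conn.execute("""
    SELECT
        T1.person_owes,
        T1.person_is_owed,
        T1.amount AS owes_amount,
        COALESCE(T2.amount, 0) AS is_owed_amount
    FROM My_Table T1
    LEFT OUTER JOIN My_Table T2
        ON T2.person_owes = T1.person_is_owed
       AND T2.person_is_owed = T1.person_owes
    ORDER BY T1.person_owes
""").fetchall()

for row in rows:
    print(row)
```

Matching on both names matters: with only one condition, a row could pair with debts involving a third person.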
If you plan to use SQL much then you should invest the time in reading one (or preferably more) beginning books on the subject.

Assuming that the combination of (person_ows, person_is_owed) is unique
select person_ows,
person_is_owed,
amount,
(select t2.amount
from the_table t2
where (t2.person_ows, t2.person_is_owed) = (t1.person_is_owed, t1.person_ows))
from the_table t1

Left Join not returning results when a join condition fails

After reading a lot about many other people having left join issues and missing rows, I still have not come to a conclusion on why my row is not showing up.
SELECT UNIT_MAIN.UNIT_NO
,F_CARD.CARD_NO
,F_CARD.END_DT
FROM UNIT_MAIN
LEFT JOIN F_CARD
ON UNIT_MAIN.UNIT_ID = F_CARD.ASSIGNED_ID
AND ((F_CARD.CARD_NO)<>'9' & [unit_no])
AND (F_CARD.CARD_NO)<> Replace(LTrim(Replace([unit_no],'0',' ')),' ','0')
AND ((F_CARD.END_DT) Is Null)
WHERE UNIT_MAIN.UNIT_NO = '555'
'555' is just the value I enter to attempt to find the missing row (I know which row isn't appearing, I can't figure out why). I took out all other where clauses and it still isn't appearing.
I am trying to obtain the UNIT_NO in question along with the F_CARD.CARD_NO and F_CARD.END_DT if applicable. The F_CARD table contains two entries that match on UNIT_ID, but neither has a null value in F_CARD.END_DT, so the join condition fails.
Does this not mean that the UNIT_NO should still appear as the left result of a left join?
The result I would want looks like:
555 | null | null
But it just doesn't show up. If I remove the F_CARD.END_DT IS NULL condition from the join, then I get
555 | 123 | Jan 2015
555 | 234 | Feb 2015
Thanks!

The method I used to deal with this issue was to use subqueries for all join conditions that didn't require matching a column value between two tables. In other words, filter F_CARD in a subquery first, then LEFT JOIN to that subquery on the matching column alone.
I don't use MS Access much, but having to deal with the brackets on my joins and this problem involving the join conditions is quite frustrating.
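A rough sketch of that query shape, run with Python's sqlite3 rather than Access (so it illustrates the structure only, not Access's behaviour or bracket syntax). The UNIT_ID value 1 is a made-up key, and only the END_DT filter from the question is moved into the subquery; the CARD_NO conditions would go in the same place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UNIT_MAIN (UNIT_ID INTEGER, UNIT_NO TEXT);
    CREATE TABLE F_CARD (ASSIGNED_ID INTEGER, CARD_NO TEXT, END_DT TEXT);
    INSERT INTO UNIT_MAIN VALUES (1, '555');
    INSERT INTO F_CARD VALUES (1, '123', 'Jan 2015'), (1, '234', 'Feb 2015');
""")

# The non-matching filter (END_DT IS NULL) lives inside the subquery,
# so the outer ON clause only compares columns between the two tables.
rows = conn.execute("""
    SELECT u.UNIT_NO, c.CARD_NO, c.END_DT
    FROM UNIT_MAIN u
    LEFT JOIN (
        SELECT ASSIGNED_ID, CARD_NO, END_DT
        FROM F_CARD
        WHERE END_DT IS NULL
    ) c ON u.UNIT_ID = c.ASSIGNED_ID
    WHERE u.UNIT_NO = '555'
""").fetchall()

print(rows)
```

Because both cards have an end date, the subquery is empty and the unit still appears, with NULLs (None in Python) for the card columns.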

SSRS query and WHERE with multiple

I'm new to SQL and SSRS. I can do many things already, but I think I must be missing some basics, so I keep banging my head against the wall.
I have a report that is almost working but needs to include more results, based on conditions.
My working query so far is like this:
SELECT projects.project_number, project_phases.project_phase_id, project_phases.project_phase_number, project_phases.project_phase_header, project_phase_expensegroups.projectphase_expense_total, invoicerows.invoicerow_total
FROM projects INNER JOIN
project_phases ON projects.project_id = project_phases.project_id
LEFT OUTER JOIN
project_phase_expensegroups ON project_phases.project_phase_id = project_phase_expensegroups.project_phase_id
LEFT OUTER JOIN
invoicerows ON project_phases.project_phase_id = invoicerows.project_phase_id
WHERE ( projects.project_number = @iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total >0 )
The parameter comes from a selection list that is used to choose a project for the report.
How can I also include records where
( project_phase_expensegroups.projectphase_expense_total ) has the value 0 but there might be invoices for that project phase?
I already tried adding another condition like this:
WHERE ( projects.project_number = @iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total > 0 )
OR
( invoicerows.invoicerow_total > 0 )
but while it gives some results, including the one where projectphase_expense_total is 0, the report is a total mess.
So my question is: what am I doing wrong here?

There is a core problem with your query in that you are left joining to two tables, implying that rows may not exist, but then putting conditions on those tables, which will eliminate NULLs. That means your query is internally inconsistent as is.
The next problem is that you're joining two tables to project_phases that both may have multiple rows. Since these data are not related to each other (as proven by the fact that you have no join condition between project_phase_expensegroups and invoicerows), your query is not going to work correctly. For example, given a list of people, a list of those people's favorite foods, and a list of their favorite colors like so:
People
Person
------
Joe
Mary
FavoriteFoods
Person  Food
------  ---------
Joe     Broccoli
Joe     Bananas
Mary    Chocolate
Mary    Cake
FavoriteColors
Person  Color
------  ----------
Joe     Red
Joe     Blue
Mary    Periwinkle
Mary    Fuchsia
When you join these with links between Person <-> Food and Person <-> Color, you'll get a result like this:
Person  Food       Color
------  ---------  ----------
Joe     Broccoli   Red
Joe     Bananas    Red
Joe     Broccoli   Blue
Joe     Bananas    Blue
Mary    Chocolate  Periwinkle
Mary    Chocolate  Fuchsia
Mary    Cake       Periwinkle
Mary    Cake       Fuchsia
This is essentially a cross-join, also known as a Cartesian product, between the Foods and the Colors, because they have a many-to-one relationship with each person, but no relationship with each other.
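The row multiplication is easy to reproduce. This sketch loads the three example tables into an in-memory SQLite database through Python's sqlite3 and joins both child tables to People; the table and column names are the ones used in the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (Person TEXT);
    CREATE TABLE FavoriteFoods (Person TEXT, Food TEXT);
    CREATE TABLE FavoriteColors (Person TEXT, Color TEXT);
    INSERT INTO People VALUES ('Joe'), ('Mary');
    INSERT INTO FavoriteFoods VALUES
        ('Joe', 'Broccoli'), ('Joe', 'Bananas'),
        ('Mary', 'Chocolate'), ('Mary', 'Cake');
    INSERT INTO FavoriteColors VALUES
        ('Joe', 'Red'), ('Joe', 'Blue'),
        ('Mary', 'Periwinkle'), ('Mary', 'Fuchsia');
""")

# Foods and colors are each linked to the person, but not to each other,
# so every food is paired with every color for the same person.
rows = conn.execute("""
    SELECT p.Person, f.Food, c.Color
    FROM People p
    JOIN FavoriteFoods f ON f.Person = p.Person
    JOIN FavoriteColors c ON c.Person = p.Person
    ORDER BY p.Person, f.Food, c.Color
""").fetchall()

print(len(rows))  # 2 people x 2 foods x 2 colors
for row in rows:
    print(row)
```

Four food rows and four color rows turn into eight result rows, which is what inflates totals in a report built on such a join.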
There are a few ways to deal with this in the report.
Create ExpenseGroup and InvoiceRow subreports, that are called from the main report by a combination of project_id and project_phase_id parameters.
Summarize one or the other set of data into a single value. For example, you could sum the invoice rows. Or, you could concatenate the expense groups into a single string separated by commas.
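The summarizing option can be sketched as joining to pre-aggregated subqueries, so each project phase contributes at most one row per side. The example below uses Python's sqlite3 with made-up sample values and a cut-down project_phases table (assumptions); it sums both sides, although summarizing just one of them is enough to stop the multiplication.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_phases (project_phase_id INTEGER);
    CREATE TABLE project_phase_expensegroups (project_phase_id INTEGER, projectphase_expense_total REAL);
    CREATE TABLE invoicerows (project_phase_id INTEGER, invoicerow_total REAL);
    -- Hypothetical data: phase 1 has two expense groups and three invoice rows,
    -- phase 2 has a zero expense total but one invoice, phase 3 has neither.
    INSERT INTO project_phases VALUES (1), (2), (3);
    INSERT INTO project_phase_expensegroups VALUES (1, 100), (1, 50), (2, 0);
    INSERT INTO invoicerows VALUES (1, 10), (1, 20), (1, 30), (2, 40);
""")

# Each subquery collapses its table to one row per phase before the join,
# so expense rows and invoice rows can no longer multiply each other.
rows = conn.execute("""
    SELECT ph.project_phase_id,
           COALESCE(e.expense_total, 0) AS expense_total,
           COALESCE(i.invoice_total, 0) AS invoice_total
    FROM project_phases ph
    LEFT JOIN (
        SELECT project_phase_id, SUM(projectphase_expense_total) AS expense_total
        FROM project_phase_expensegroups
        GROUP BY project_phase_id
    ) e ON e.project_phase_id = ph.project_phase_id
    LEFT JOIN (
        SELECT project_phase_id, SUM(invoicerow_total) AS invoice_total
        FROM invoicerows
        GROUP BY project_phase_id
    ) i ON i.project_phase_id = ph.project_phase_id
    ORDER BY ph.project_phase_id
""").fetchall()

for row in rows:
    print(row)
```

A plain join of the raw tables would have produced six rows for phase 1 alone; here every phase appears once.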
Some notes:
Please, please format your query before posting it in a question. It is almost impossible to read when not formatted. It seems pretty clear that you're using a GUI to create the query, but do us the favor of not having to format it ourselves just to help you.
While formatting, please use aliases; don't use full table names. They just make the query that much harder to understand.

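You need an extra pair of parentheses in your WHERE clause in order to get the logic right. AND binds more tightly than OR, so without them the project_number filter only applies to the first condition.
WHERE ( projects.project_number = @iProjectNumber )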
AND (
(project_phase_expensegroups.projectphase_expense_total > 0)
OR
(invoicerows.invoicerow_total > 0)
)
Also, you're using a column in your WHERE clause from a table that is left joined without checking for NULLs. That basically makes it a (slow) inner join. If you want to include rows that don't match from that table you also need to check for NULL. Any comparison other than IS NULL never evaluates to true for a NULL value (the result is unknown), so those rows are filtered out. See this page for more information about SQL's three-valued predicate logic: http://www.firstsql.com/idefend3.htm
To keep your LEFT JOINs working as you intended you would need to do this:
WHERE ( projects.project_number = @iProjectNumber )
AND (
project_phase_expensegroups.projectphase_expense_total > 0
OR project_phase_expensegroups.project_phase_id IS NULL
OR invoicerows.invoicerow_total > 0
OR invoicerows.project_phase_id IS NULL
)
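To see the effect of the NULL check in isolation, here is a small sketch with Python's sqlite3. It uses a cut-down version of the tables with hypothetical data: phase 1 has a positive expense total, phase 2 has a total of 0, and phase 3 has no expense group row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_phases (project_phase_id INTEGER);
    CREATE TABLE project_phase_expensegroups (project_phase_id INTEGER, projectphase_expense_total REAL);
    INSERT INTO project_phases VALUES (1), (2), (3);
    INSERT INTO project_phase_expensegroups VALUES (1, 100), (2, 0);
""")

base = """
    SELECT ph.project_phase_id
    FROM project_phases ph
    LEFT JOIN project_phase_expensegroups eg
        ON eg.project_phase_id = ph.project_phase_id
    WHERE {condition}
    ORDER BY ph.project_phase_id
"""

# Filtering on the left-joined column alone drops phase 3: NULL > 0 is not true.
without_null_check = [r[0] for r in conn.execute(
    base.format(condition="eg.projectphase_expense_total > 0"))]

# Adding the IS NULL test keeps phases that have no expense group row.
with_null_check = [r[0] for r in conn.execute(
    base.format(condition="eg.projectphase_expense_total > 0 "
                          "OR eg.project_phase_id IS NULL"))]

print(without_null_check, with_null_check)
```

Note that phase 2 is excluded either way: it has a matching row, and that row's total of 0 fails the > 0 test.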

I found the solution and it was kind of easy after all. I changed only the second LEFT OUTER JOIN to an INNER JOIN and left out the condition that restricted the query to results over zero. I also used SELECT DISTINCT.
Now my report is working perfectly.

Query Design, Strange Results - New to SQL-Server

I've written a query that's producing ghost records. Here is the statement, which produces correct results from one table JOINed to a second table to grab the student's LAST_ATTEND_DATE. Notice that the LAST_ATTEND_DATE column is commented out, so it won't display:
SELECT DISTINCT TOP 500
SAC.STC_PERSON_ID AS CCID#,
SAC.STC_COURSE_NAME AS CourseName,
SAC.STC_TITLE AS Title,
SAC.STC_VERIFIED_GRADE AS Grade,
--CONVERT(varchar(10),SCS.SCS_LAST_ATTEND_DATE,101) AS LastAttended,
SAC.STC_REPORTING_TERM AS Term,
SAC.STC_ACAD_LEVEL AS AcadLevel
FROM STUDENT_ACAD_CRED SAC
JOIN STUDENT_COURSE_SEC SCS ON SAC.STC_PERSON_ID = SCS.SCS_STUDENT
WHERE (SAC.STC_ACAD_LEVEL = 'UG') AND (SCS.SCS_LAST_ATTEND_DATE IS NOT NULL)
ORDER BY SAC.STC_PERSON_ID;
This produces what I need, except that I also need to display the student's Last Attended Date in the results. If I un-comment the statement above to display the LAST_ATTEND_DATE, 4 records appear, of which 2 are ghost records. For example, take student ID = '0000002': he took English 1010 once in the Fall of 1992 and made a D, then retook the course in the Fall of 1993 and made a B.
0000002 ENGL*1010 English I D 92/FA UG
0000002 ENGL*1010 English I B 93/FA UG
With the LAST_ATTEND_DATE statement (CONVERT(varchar(10),SCS.SCS_LAST_ATTEND_DATE,101) AS LastAttended) un-commented to display the date, then 3 additional records appear...
I've tried changing the query between the 2 tables from JOIN, to LEFT JOIN, FULL JOIN and RIGHT JOIN. I always get 3 additional records that don't exist.
0000002 ENGL*1010 English I B 01/19/1995 93/FA UG
0000002 ENGL*1010 English I B 07/18/1996 93/FA UG
0000002 ENGL*1010 English I B 09/25/1992 93/FA UG
0000002 ENGL*1010 English I D 01/19/1995 92/FA UG
0000002 ENGL*1010 English I D 07/18/1996 92/FA UG
Would anyone know the correct syntax to JOIN these 2 tables correctly to display the data correctly?
Thanks so much for sharing your knowledge,
Donald, Casper College

Most likely the Student_Course_Sec table contains more than one record per student, which your join statement is not accounting for.
For example, if the SCS table consists of:
SCS_Student  SCS_CourseName  SCS_LastAttendDate
1            English         1/1/2014
1            Calculus        2/1/2014
2            English         3/1/2014
2            Philosophy      4/1/2014
And your SAC table consists of:
STC_PERSON_ID  STC_COURSE_NAME  etc.
1              English
1              Calculus
2              English
2              Philosophy
then when you SELECT * FROM SAC JOIN SCS ON SAC.STC_PERSON_ID = SCS.SCS_STUDENT, your result set looks like this:
(row)  STC_ID  STC_Course  SCS_ID  SCS_Course  SCS_Date
1      1       English     1       English     1/1/2014
2      1       English     1       Calculus    2/1/2014
3      1       Calculus    1       English     1/1/2014
4      1       Calculus    1       Calculus    2/1/2014
5      2       English     2       English     3/1/2014
6      2       English     2       Philosophy  4/1/2014
7      2       Philosophy  2       English     3/1/2014
8      2       Philosophy  2       Philosophy  4/1/2014
Your WHERE clause then filters out all the rows where STC_COURSE is not "English", leaving you with 4 rows (row numbers 1,2,5,6) instead of just the 2 you really want (rows 1 and 5). (And, because you're not reporting any of the other fields, it just looks like "phantom records" appear out of nowhere.)
To fix it, you need additional conditions on your JOIN specifying what else besides the ID needs to match up. In my contrived case, you would need to say
JOIN STUDENT_COURSE_SEC SCS ON SAC.STC_PERSON_ID = SCS.SCS_STUDENT AND SAC.STC_COURSE_NAME = SCS.SCS_COURSE_NAME
so that only rows where both the student and the course match are selected.
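Here is the contrived case as a runnable sketch in Python's sqlite3. The tables hold only the columns from the example above, and the SCS column names (SCS_COURSE_NAME, SCS_LAST_ATTEND_DATE) are assumptions for illustration; the real schema may link the two tables through a different key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT_ACAD_CRED (STC_PERSON_ID INTEGER, STC_COURSE_NAME TEXT);
    CREATE TABLE STUDENT_COURSE_SEC (SCS_STUDENT INTEGER, SCS_COURSE_NAME TEXT, SCS_LAST_ATTEND_DATE TEXT);
    INSERT INTO STUDENT_ACAD_CRED VALUES
        (1, 'English'), (1, 'Calculus'), (2, 'English'), (2, 'Philosophy');
    INSERT INTO STUDENT_COURSE_SEC VALUES
        (1, 'English', '1/1/2014'), (1, 'Calculus', '2/1/2014'),
        (2, 'English', '3/1/2014'), (2, 'Philosophy', '4/1/2014');
""")

# Joining on the student alone pairs every course with every section of that student.
student_only = conn.execute("""
    SELECT SAC.STC_PERSON_ID, SAC.STC_COURSE_NAME, SCS.SCS_LAST_ATTEND_DATE
    FROM STUDENT_ACAD_CRED SAC
    JOIN STUDENT_COURSE_SEC SCS ON SAC.STC_PERSON_ID = SCS.SCS_STUDENT
""").fetchall()

# Adding the course to the join condition leaves one section per course.
student_and_course = conn.execute("""
    SELECT SAC.STC_PERSON_ID, SAC.STC_COURSE_NAME, SCS.SCS_LAST_ATTEND_DATE
    FROM STUDENT_ACAD_CRED SAC
    JOIN STUDENT_COURSE_SEC SCS
        ON SAC.STC_PERSON_ID = SCS.SCS_STUDENT
       AND SAC.STC_COURSE_NAME = SCS.SCS_COURSE_NAME
    ORDER BY SAC.STC_PERSON_ID, SAC.STC_COURSE_NAME
""").fetchall()

print(len(student_only), len(student_and_course))
for row in student_and_course:
    print(row)
```

The first query returns the eight rows from the table above; the second returns four, one per course actually taken.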

The 'ghost' records are actually the true result set. The reason that they don't display when you comment out SCS.SCS_LAST_ATTEND_DATE is that you are creating duplicate records since the date is the only differentiator, and your DISTINCT is suppressing the duplicates.
If you remove the DISTINCT, and leave SCS.SCS_LAST_ATTEND_DATE commented out, you should then get the same number of rows as when you uncomment the date.
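The same effect can be reproduced on a tiny made-up data set. The sketch below (Python's sqlite3, hypothetical two-course data for one student, joined on the student ID only) counts the rows with and without DISTINCT, and with and without the date column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT_ACAD_CRED (STC_PERSON_ID TEXT, STC_COURSE_NAME TEXT);
    CREATE TABLE STUDENT_COURSE_SEC (SCS_STUDENT TEXT, SCS_LAST_ATTEND_DATE TEXT);
    -- Hypothetical data: one student, two courses, two section records.
    INSERT INTO STUDENT_ACAD_CRED VALUES ('1', 'English'), ('1', 'Calculus');
    INSERT INTO STUDENT_COURSE_SEC VALUES ('1', '1/1/2014'), ('1', '2/1/2014');
""")

join = """
    FROM STUDENT_ACAD_CRED SAC
    JOIN STUDENT_COURSE_SEC SCS ON SAC.STC_PERSON_ID = SCS.SCS_STUDENT
"""

def count(select_list):
    """Return the number of rows produced for the given SELECT list."""
    return len(conn.execute("SELECT " + select_list + join).fetchall())

# DISTINCT without the date collapses the duplicated course rows.
distinct_no_date = count("DISTINCT SAC.STC_PERSON_ID, SAC.STC_COURSE_NAME")
# Without DISTINCT the duplicates are visible even with the date left out.
plain_no_date = count("SAC.STC_PERSON_ID, SAC.STC_COURSE_NAME")
# With the date included, every row differs, so DISTINCT removes nothing.
distinct_with_date = count(
    "DISTINCT SAC.STC_PERSON_ID, SAC.STC_COURSE_NAME, SCS.SCS_LAST_ATTEND_DATE")

print(distinct_no_date, plain_no_date, distinct_with_date)
```

The join always produces four rows here; DISTINCT only hides two of them when the date column is not selected.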
Playing around with the JOIN types implies that you don't really know what you are trying to query. As @MarkD said in the comments, we would need to see your data model in order to help you further.

Sql views vs jdbc select-join, where to abstract?

I have 3 tables (see below), Table A describes a product, Table B holds inventory information for different dates, and Table C holds the price of each product for different dates.
Table A
------------------
product_id  product_name
1           book
2           pencil
3           stapler
...         ...
Table B
------------------
product_id  date_id     quantity
1           2012-12-01  100
1           2012-12-02  110
1           2012-12-03  90
2           2012-12-01  98
2           2012-12-02  50
...         ...         ...
Table C
-------------------
product_id  date_id     price
1           2012-12-01  10.29
1           2012-12-02  12.12
2           2012-12-02  32.98
3           2012-12-01  10.12
In many parts of my Java application I would like to know the dollar value of each product, so I end up running the following query:
select
a.product_name,
b.date_id,
b.quantity * c.price as total
from A a
join B b on a.product_id = b.product_id
join C c on a.product_id = c.product_id and b.date_id = c.date_id
where b.date_id = ${date_input}
I had an idea today that I could make the query above be a view (minus the date condition), then query the view for a specific date so my queries would look like
select * from view where date_id = ${date_input}
I'm not sure where the appropriate level of abstraction for such logic is. Should it be in java code (read from a pref file), or encoded into a view in the database?
The only reason I don't want to put it as a view is that as time goes by the join will become expensive as there will be more and more dates to cover, and I'm usually only interested in the past month's worth of data. Perhaps a stored proc is better? Would that be a good place to abstract this logic?

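If views are implemented correctly you should never see worse performance in a case like this, where the query would be the same without the view. A view is not materialized by default; the database expands it into the underlying query and applies your date filter to it, so having more dates in the tables costs no more through the view than it would in the hand-written join.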
Make the view, it is the correct abstraction in this case.
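A minimal sketch of the view approach, using Python's sqlite3 and the sample rows from the question. The view name product_value is made up for illustration, and the date is passed as a bound parameter instead of being spliced into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (product_id INTEGER, product_name TEXT);
    CREATE TABLE B (product_id INTEGER, date_id TEXT, quantity INTEGER);
    CREATE TABLE C (product_id INTEGER, date_id TEXT, price REAL);
    INSERT INTO A VALUES (1, 'book'), (2, 'pencil'), (3, 'stapler');
    INSERT INTO B VALUES
        (1, '2012-12-01', 100), (1, '2012-12-02', 110), (1, '2012-12-03', 90),
        (2, '2012-12-01', 98), (2, '2012-12-02', 50);
    INSERT INTO C VALUES
        (1, '2012-12-01', 10.29), (1, '2012-12-02', 12.12),
        (2, '2012-12-02', 32.98), (3, '2012-12-01', 10.12);

    -- The join from the question, minus the date condition (view name is hypothetical).
    CREATE VIEW product_value AS
    SELECT a.product_name,
           b.date_id,
           b.quantity * c.price AS total
    FROM A a
    JOIN B b ON a.product_id = b.product_id
    JOIN C c ON a.product_id = c.product_id AND b.date_id = c.date_id;
""")

# Callers filter the view by date; the value is bound, not concatenated.
rows = conn.execute(
    "SELECT product_name, date_id, total FROM product_value "
    "WHERE date_id = ? ORDER BY product_name",
    ("2012-12-02",),
).fetchall()

for name, date_id, total in rows:
    print(name, date_id, round(total, 2))
```

The application code then only needs the one-line SELECT against the view, and the join logic lives in a single place in the database.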
