Tricky SQL I cant get...Dates and NULLs - sql

Imagine a table that has some line items
LineItemID CountryID Date
1 China 6/26/2011
2 China 6/27/2011
3 US 3/21/2011
I also have a table that has some rates so:
CountryID ExchangeRateDate ExchangeRateDateTo Rate
US 1/1/2011 NULL 1
China 6/1/2011 6/13/2011 6.06
China 6/13/2011 6/26/2011 6.13
China 6/26/2011 NULL 6.26
Notice the rate for the US doesnt change its simply a rate of 1 with a NULL for the ExchangeRateDateTo. I can join the tables by the countryID no problem, my issue is for instance how can I join not only the country ID but also to use the FirstTable's date with the 2nd tables ExchangeRateDate/To to get the correct date.
For instance, I cannot say
WHERE FirstTable.[Date] BETWEEN SecondTable.ExchangeRateDate AND SecondTable.ExchangeRateDateTo
Because for instance China's rate becomes null (starts from 6/26/2011 till NULL).
So basically I am looking for a way such that I get the result for instance of china using the rate of 6.26 because the dates are from 6/26-6/27. The rate of 6.13 ended right on 6/26, so it should pickup the new rate.
So my join would be to countryID plus using the date range to pick up the right rate, otherwise if i only join by the countryid you can see that that would yield a cartesian.

Use a sentinel value
WHERE
FirstTable.[Date]
BETWEEN SecondTable.ExchangeRateDate
AND ISNULL(SecondTable.ExchangeRateDateTo, '99991231')
You can also do this in the JOIN too
FROM
FirstTable F
JOIN
SecondTable S ON F.[Date] BETWEEN S.ExchangeRateDate AND ISNULL(S.ExchangeRateDateTo, '99991231')
COALESCE is more portable but has side effects around datatype precedence.
It's acceptable to store 9991231 as the ExchangeRateDateTo date: may not be "correct" but it simplifies code and JOINs.
Edit: to work around incorrect ranges, use a non-inclusive comparison
Assuming the FromDate is the start of the range...
FROM
FirstTable F
JOIN
SecondTable S ON F.[Date] >= S.ExchangeRateDate AND
F.[Date] < ISNULL(S.ExchangeRateDateTo, '99991231')
Change to > a nd <= to make it more confusing if required

For your ExchangeRateDateTo comparisons, just use:
COALESCE(ExchangeRateDateto, GETDATE())
...which will check for either a non-NULL date OR the current date if the date field is NULL.

Related

Duplicate rows because 1 column has multiple distinct values

I'm running a SELECT query to get data across multiple tables in the same server instance. However I've just noticed that the rows pulled on some data get duplicated because the main table I'm pulling from has a few different values in one of the columns. Here's the query:
SELECT DISTINCT BIF030.C_ACCOUNT AS ACCOUNTNUMBER,
BIF003.C_ACCOUNTTYPE AS ACCOUNTTYPECODE,
CON013.C_DESCRIPTION AS ACCOUNTTYPE,
BIF003.C_DIVISION AS ZONE_DIVISONCODE,
CON028.C_DESCRIPTION AS ZONE_DIVISION,
BIF030.C_METER as METERNUMBER,
BIF005.C_METERCUSTOM1 AS REGISTERNUMBER,
CONVERT(DECIMAL(20,2), BIF030.N_CONSUMP) AS CONSUMPTION,
CON007.C_DESCRIPTION AS UNITS,
BIF030.T_READDATE AS READINGDATE,
MONTH(BIF030.T_READDATE) AS READINGMONTH,
DAY(BIF030.T_READDATE) AS READINGDAY,
YEAR(BIF030.T_READDATE) AS READINGYEAR,
BIF030.I_DAYS AS READINGDAYSCOUNT
FROM ADVANCED.BIF030
LEFT JOIN ADVANCED.CON007 ON CON007.C_UNITS=BIF030.C_UNITS
LEFT JOIN ADVANCED.BIF005 ON BIF005.C_METER=BIF030.C_METER
LEFT JOIN ADVANCED.BIF003 ON BIF003.C_ACCOUNT=BIF030.C_ACCOUNT
LEFT JOIN ADVANCED.CON013 ON CON013.C_ACCOUNTTYPE=BIF003.C_ACCOUNTTYPE
LEFT JOIN ADVANCED.CON028 ON CON028.C_DIVISION=BIF003.C_DIVISION
WHERE T_READDATE > '01-01-2014'
ORDER BY ACCOUNTNUMBER, READINGDATE ASC
I know SELECT DISTINCT is frowned upon, but I get even more rows without it. Here's a sample of what the data looks like when pulled:
ACCOUNTNUMBER
ACCOUNTTYPECODE
ACCOUNTTYPE
ZONE_DIVISIONCODE
ZONE_DIVISION
METERNUMBER
REGISTERNUMBER
CONSUMPTION
UNITS
READINGDATE
READINGMONTH
READINGDAY
READINGYEAR
READINGDAYSCOUNT
1234567
SP
ACCOUNT TYPE 1
00
00-NO ZONE
123456789
987654321
3.00
Thousands of Gallons
2014-01-16 00:00:00.00
1
16
2014
30
1234567
MF
ACCOUNT TYPE 2
02
02-GRAVITY
123456789
987654321
3.00
Thousands of Gallons
2014-01-16 00:00:00.00
1
16
2014
30
1234567
SR
ACCOUNT TYPE 3
02
02-GRAVITY
123456789
987654321
3.00
Thousands of Gallons
2014-01-16 00:00:00.00
1
16
2014
30
I also know the column that is messing this up is the "AccountTypeCode" because other accounts that don't have multiple codes associated with the "AccountNumber" only show 1 set of rows. So this one specifically (and probably others) is tripling the amount of rows pulled when it should only pull one for each "ReadingDate".
Also if anyone knows a good way to optimize the query I'd be happy to learn. I know just enough SQL to be dangerous, but not enough to figure this out. Thanks.
Ok. So good news and I want to add this in case it helps anyone else in the future. I found out that since the ACCOUNTTYPECODE and ZONE_DIVISIONCODE were coming from the table BIF003 I needed to add more in the WHERE statement. This is what fixed it for me:
AND BIF030.C_CUSTOMER = BIF003.C_CUSTOMER
Because the C_CUSTOMER column was different (it's a column in the BIF003 and BIF030 tables) which lead to the separate ACCOUNTTYPECODE results I need to check it in the WHERE statement.
Thanks everyone for kick starting my brain on this one.

Compare a date with a block of dates in SQL

i tried to ready a lot of date comparisons that i found here on stackoverflow and spread into the internet but i wasn't able to find the solution.
I have the following table (Trips):
VehicleID DriverID xID CheckIn CheckOut DateHour
462 257 7 1 0 16/12/2017 20:40:00
462 257 7 0 1 19/12/2017 10:05:00
5032 3746 11 1 0 02/10/2017 07:00:00
5032 3746 11 0 1 06/10/2017 17:00:00
When my company receives a traffic ticket, i want to compare the date from the ticket with the hole block of dates from the table "Trips", each block starts with CheckIn = 1 and finishes with CheckOut = 1, so this way i will know which driver was responsable for the ticket through the DriverID.
For example: the traffic ticket date and time are: 17/12/2017 08:00:00 and the Vehicle is the one with id = 462, i'll insert this date and time in a field in our system to consult automaticaly which driver was driving that car at that moment, we won´t use the ticket table yet. Looking at my example, i know it should return DriverID = 257, but theres a lot of trips with the same vehicle and diferent drivers.....The major problem is how can i compare the Date and Hour from the Ticket with the range of dates from the trips, since i have to consider 1 trip = 2 lines in the table
Unfortunately i can't change the way this table was created, cause we need this 2 lines, CheckIn and CheckOut, separately.
Any thoughts or directions?
Thank you for your attention
select t1.VehicleID
,t1.DriverID
,t1.xID
,t1.DateHour as Checkin
,t2.DateHour as Checkout
from trips as t1 join trips as t2 --self join trips to get both start and end in a single row
on t1.VehicleID = t2.VehicleID -- add all columns
and t1.DriverID = t2.DriverID -- which define
and t1.xID = t2.xID -- a unique trip
and t1.Checkin = 1 -- start
and t2.Checkout = 1 -- end
join tickets -- now join tickets
on tickets.trafficDateHour between t1.DateHour and t2.DateHour
I didn't make sample tables, this will not run as is, but something like this should do it for you:
SELECT *
FROM tickets, trips
WHERE
trips.datehour in (
SELECT trips.datehour
FROM tickets, trips
WHERE
tickets.ticket_date < trips.datehour AND
trips.checkin = 0
) AND
tickets.ticket_date > trips.datehour AND
trips.checkin = 1
If you are running this for a specific date as described in the comment above, it will work. If you are trying to run it for a set of ticket dates all at once, you'll require recursion. Recursion is a different beast depending on your flavor of SQL.

SQL with Bob and John owing each other money

I have got the following 3 fields in a file: person_ows person_is_owed amount
Example content:
Bob John 100
John Bob 110
What does a SQL look like that produces:
Bob John 100 110
John Bob 110 100
Sorry if this is a trivial question, but I am just trying to learn SQL and I find it really like HELL!
So, what you need is to be able to JOIN two rows. In this case you'll probably want an OUTER JOIN assuming that there isn't always a match of each owing the other. Now you just need to come up with your JOIN criteria, which in this case is going to be based on the names (person_owes and person_is_owed):
SELECT
T1.person_owes,
T1.person_is_owed,
T1.amount AS owes_amount,
COALESCE(T2.amount, 0) AS is_owed_amount
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON T2.person_is_owed = T1.person_owes
The COALESCE is just to make sure that when there is no match that you get a value of 0 instead of NULL.
Also, this assumes that there is only going to be one of each combination of person_owes and person_is_owed. If you might have two rows showing that John owes Bill two different amounts of money then you would have to adjust the SQL above and it would be a bit more complex.
If you plan to use SQL much then you should invest the time in reading one (or preferably more) beginning books on the subject.
Assuming that the combination of (person_ows, person_is_owed) is unique
select person_ows,
person_is_owed,
amount,
(select t2.amount
from the_table t2
where (t2.person_ows, t2.person_is_owed) = (t1.person_is_owed, t1.person_ows))
from the_table t1

SQL - Selecting Records with an odd number of a given attribute

I'm just brushing up on some SQL - in other words, I'm really rusty - and am a bit stuck at the moment. It's probably something trivial, but we'll see.
I'd like to select all people that possess an odd number of a certain attribute that isn't an integer ( in this example, TransactionType). So, for example, take the following test/not real info where these people are buying a car or some similarly big purchase.
Name TransactionType Date
John Buy 5/1
John Cancel 5/1
John Buy 5/2
Joseph Buy 5/25
Joseph Cancel 5/25
Tanya Buy 5/28
I would like it to return the people who had an odd number of transactions; in other words, they ended up purchasing the item. So, in this case, John and Tanya would be selected and Joseph would not.
I know I can use the modulus operand here, but I'm a bit lost how to utilize it correctly.
I thought of using
count(TransactionType) % 2 != 0
in the where clause but that's obviously a no-go. Any pointers in the right direction would be very helpful. Let me know if this is unclear, and thanks!
You are close. You need a having clause instead of a where clause.
select Name
from table
group by Name
having count(TransactionType) % 2 != 0
Wouldn't you be better off getting the latest status by the transaction date and using that rather than relying on counting TransactionType to determine the latest status:
Something like this:
SELECT b.Name, b.TransactionType, b.[Date]
FROM (
SELECT Name, MAX(t1.[DATE]) latestDate
FROM [Transactions] t1
GROUP BY t1.Name
) a
INNER JOIN [Transactions] b ON b.Name = a.Name AND a.latestDate = b.[Date]
WHERE b.TransactionType = 'Buy'
Assuming your dates are valid dates with times included, this should work.
Sample SQL Fiddle
If you only store the date portion the max date would be the same for people that Buy and Cancel on the same date, therefore it would return more data and some incorrect records.

SQL SUM with Repeating Sub Entries - Best Practice?

I hit this issue regularly but here is an example....
I have a Order and Delivery Tables. Each order can have one to many Deliveries.
I need to report totals based on the Order Table but also show deliveries line by line.
I can write the SQL and associated Access Report for this with ease ....
SELECT xxx
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
until I get to the summing element. I obviously only want to sum each Order once, not the 1-many times there are deliveries for that order.
e.g. The SQL might return the following based on 2 Orders (ignore the banalness of the report, this is very much simplified)
Region OrderNo Value Delivery Date
North 1 £100 12-04-2012
North 1 £100 14-04-2012
North 2 £73 01-05-2012
North 2 £73 03-05-2012
North 2 £73 07-05-2012
South 3 £50 23-04-2012
I would want to report:
Total Sales North - £173
Delivery 12-04-2012
Delivery 14-04-2012
Delivery 01-05-2012
Delivery 03-05-2012
Delivery 07-05-2012
Total Sales South - £50
Delivery 23-04-2012
The bit I'm referring to is the calculation of the £173 and £50 which the first of which obviously shouldn't be £419!
In the past I've used things like MAX (for a given Order) but that seems like a fudge.
Surely there must be a regular answer to this seemingly common problem but I can't find one.
I don't necessarily need the code - just a helpful point in the right direction.
Many thanks,
Chris.
A roll up operator may not look pretty. However, it would do the regular aggregates that you see now, and it show the subtotals of the order. This is what you're looking for.
SELECT xxx
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
GROUP BY xxx
WITH ROLLUP;
I'm not exactly sure how the rest of your query is set up, but it would look something like this:
Region OrderNo Value Delivery Date
North 1 £100 12-04-2012
North 1 £100 14-04-2012
North 2 £73 01-05-2012
North 2 £73 03-05-2012
North 2 £73 07-05-2012
NULL NULL f419 NULL
I believe what you want is called a windowing function for your aggregate operation. It looks like the following:
SELECT xxx, SUM(Value) OVER (PARTITION BY Order.Region) as OrderTotal
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
Here's the MSDN article. The PARTITION BY tells the SUM to be done separately for each distinct Order.Region.
Edit: I just noticed that I missed what you said about orders being counted multiple times. One thing you could do is SUM() the values before joining, as a CTE (guessing at your schema a bit):
WITH RegionOrders AS (
SELECT Region, OrderNo, SUM(Value) OVER (PARTITION BY Region) AS RegionTotal
FROM Order
)
SELECT Region, OrderNo, Value, DeliveryDate, RegionTotal
FROM RegionOrders RO
INNER JOIN Delivery D on D.OrderNo = RO.OrderNo