SQL DateDiff Syntax - sql

I have a homework problem that I'm having a lot of trouble with... I don't expect the answer and I truly want to learn it. Could somebody help me out with the syntax?
Problem:
For each Sales Order, show how many days it took to ship the order in order by the longest order, then by Sales Order Number. Display Sales Order Number and the number of days to ship. Include the orders that have not yet shipped.
So far I have:
SELECT SalesOrder.SalesOrderNumber,
DATEDIFF (d, MIN(SalesOrder.OrderDate), MAX(Shipment.ShipmentDate)) AS "DaysToShip"
FROM SalesOrder, Shipment
GROUP BY SalesOrder.SalesOrderNumber;

Sometimes it's helpful to see an intermediate form of your query to evaluate if it's providing the correct data at some stage.
Consider the following query, pulled from your example minus some elements:
SELECT SalesOrder.SalesOrderNumber, SalesOrder.OrderDate, Shipment.ShipmentDate
FROM SalesOrder, Shipment
You should observe the results of this query and see how they differ from what you expect. In this case, you haven't indicated how SalesOrder and Shipment are related. The result will be many more rows than there are orders, with each SalesOrder row paired with every Shipment row (a cross join).
Once you provide the correct join condition and achieve the desired results at that stage, try adding in aggregation (GROUP BY, MIN, MAX) and test that form of your query. Finally, when you're convinced that you have the correct inputs, add in DATEDIFF and you'll have your final query.
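The staged approach above can be sketched end to end. This is only an illustration, not your schema: it runs against SQLite through Python's sqlite3 module with made-up rows, it assumes Shipment has a SalesOrderNumber column to join on, and it uses julianday() because SQLite has no DATEDIFF (in SQL Server you would keep DATEDIFF). The LEFT JOIN keeps orders that have not shipped yet; their day count comes back as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesOrder (SalesOrderNumber INTEGER PRIMARY KEY, OrderDate TEXT);
    -- Assumption: Shipment references its order through SalesOrderNumber.
    CREATE TABLE Shipment (ShipmentId INTEGER PRIMARY KEY,
                           SalesOrderNumber INTEGER,
                           ShipmentDate TEXT);
    INSERT INTO SalesOrder VALUES (1, '2024-01-01'), (2, '2024-01-05'), (3, '2024-01-10');
    -- Order 2 has two shipments; order 3 has not shipped.
    INSERT INTO Shipment VALUES (10, 1, '2024-01-04'),
                                (11, 2, '2024-01-15'),
                                (12, 2, '2024-01-20');
""")

days_to_ship = conn.execute("""
    SELECT o.SalesOrderNumber,
           -- Whole days between the order date and the last shipment date.
           CAST(julianday(MAX(s.ShipmentDate)) - julianday(MIN(o.OrderDate)) AS INTEGER)
               AS DaysToShip
    FROM SalesOrder AS o
    LEFT JOIN Shipment AS s ON s.SalesOrderNumber = o.SalesOrderNumber
    GROUP BY o.SalesOrderNumber
    ORDER BY DaysToShip DESC, o.SalesOrderNumber
""").fetchall()

for row in days_to_ship:
    print(row)
```

With a comma join and no join condition, the same data would produce nine rows before grouping (three orders times three shipments), which is the symptom described above.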

SELECT SalesOrder.SalesOrderNumber,
DATEDIFF (d, MAX(SalesOrder.OrderDate), MAX(Shipment.ShipmentDate)) AS "DaysToShip"
FROM SalesOrder, Shipment
GROUP BY SalesOrder.SalesOrderNumber;


Ms ACCESS and queries: dates in graph not in order

I use queries in MS Access to create graphs (shown in forms) that represent monthly spend data for a supplier. I want the x axis to show the months in chronological order, and this is where I'm having issues.
The picture above shows that the x axis starts with April 2016, although the earliest date is August 2015.
The query code that creates the graph is the following:
SELECT (Format([DateStamp],"mmm"" '""yy")) AS Expr1, Sum([Item Master].SpendPerMaterial) AS Expr2
FROM [Item Master]
WHERE ((([Item Master].SupplierName)=[Forms]![Supplier History]![List0]))
GROUP BY (Format([DateStamp],"mmm"" '""yy")), (Year([DateStamp])*12+Month([DateStamp])-1);
[Item Master] is the table where all data is retrieved from. DateStamp refers to the column with months, SpendPerMaterial is the spend on a certain material in that month (which is aggregated since we look at the supplier level, not the material level), and List0 is a list where users can select a supplier from a list of suppliers.
You should never rely on the ordering of results from a query unless you include an explicit ORDER BY. In your case, the results happen to come back ordered by the grouped columns, and the first of these is the formatted month label, which is text and so sorts alphabetically ("Apr" before "Aug").
You can fix this by adding:
order by max([DateStamp])
to the query.
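A small sketch of the same idea, assuming SQLite through Python's sqlite3 module instead of Access (so strftime() stands in for Format(), and the table name and rows are invented). The label is grouped as text, but the sort key is the aggregated date:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for [Item Master].
    CREATE TABLE ItemMaster (DateStamp TEXT, SpendPerMaterial REAL);
    INSERT INTO ItemMaster VALUES
        ('2015-08-01', 10.0), ('2015-08-01', 5.0),
        ('2016-04-01', 7.0),  ('2015-12-01', 3.0);
""")

rows = conn.execute("""
    SELECT strftime('%m/%Y', DateStamp) AS MonthLabel,
           SUM(SpendPerMaterial)        AS Spend
    FROM ItemMaster
    GROUP BY MonthLabel
    ORDER BY MAX(DateStamp)   -- sort on the real date, not on the text label
""").fetchall()

labels = [label for label, _ in rows]
print(labels)
```

Sorting the labels as text would put 04/2016 first, which is the same effect as the chart starting in April.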
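I would add the following to your query, after your GROUP BY clause. In an aggregate query the ORDER BY expression has to be grouped or aggregated, so sort on the month index you already group by rather than on the bare [DateStamp] column:
ORDER BY (Year([DateStamp])*12+Month([DateStamp])-1) ASC;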
I tried the other suggestions on an aggregate totals-by-month report and had no luck. The only way I could get the actual month labels was by putting labels directly beneath the chart, which means altering it every month!

Query does not include the specified expression as part of an aggregate function in UNION query

I am doing a Union Query to add together the results of two separate queries that give me data from two different fiscal periods, to get a rolling 12 months number.
I get the message "Your query does not include the specified expression "Report_Header" as part of an aggregate function". I have read that the field needs to be included in a GROUP BY statement at the end, but when I add the field from either query, or from both queries as shown below, I still get the message. Help? I'm not a programmer, I'm an Access user, so I need it kept simple, please :).
SELECT [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB]
UNION ALL
SELECT [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2]
GROUP BY [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header,
[JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header
Thanks!
You can aggregate both subqueries:
SELECT [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB]
GROUP BY [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB].Report_Header
UNION ALL
SELECT [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2]
GROUP BY [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header;
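This may be what you want. However, it will not combine rows that share the same header across the two queries, so a header that exists in both will appear twice. To combine them, the simplest method is probably to save this union as its own query (a view) and aggregate over it.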
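To illustrate combining headers across both sources, here is a rough sketch that assumes SQLite through Python's sqlite3 module. Sub1 and Sub2 are short hypothetical stand-ins for the two long query names, and the rows are invented. The union is wrapped in a derived table and aggregated once more; in Access the inner union could be a saved query instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two rolling-12 sub-queries.
    CREATE TABLE Sub1 (Report_Header TEXT, SumOfCASES INTEGER, SumOfPurchases REAL);
    CREATE TABLE Sub2 (Report_Header TEXT, SumOfCASES INTEGER, SumOfPurchases REAL);
    INSERT INTO Sub1 VALUES ('East', 10, 100.0), ('East', 5, 50.0), ('West', 2, 20.0);
    INSERT INTO Sub2 VALUES ('East', 1, 10.0), ('North', 4, 40.0);
""")

combined = conn.execute("""
    SELECT Report_Header,
           SUM(SumOfCASES)     AS CASES,
           SUM(SumOfPurchases) AS PURCHASES
    FROM (
        SELECT Report_Header, SumOfCASES, SumOfPurchases FROM Sub1
        UNION ALL
        SELECT Report_Header, SumOfCASES, SumOfPurchases FROM Sub2
    ) AS u
    GROUP BY Report_Header      -- one GROUP BY over the whole union
    ORDER BY Report_Header
""").fetchall()

for row in combined:
    print(row)
```

Here 'East' appears in both sources and comes out as a single row.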
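Each SELECT in a UNION needs its own GROUP BY. Place GROUP BY [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header under the first query, and keep only GROUP BY [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header under the second.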

Conditional Sum Based on Count

I have reviewed the various 'Conditional Sum' questions in the forum and none quite match what I'm trying to do:
In the database, there is Date, Store#, Item#, and %Total Sales. In some cases, the same item# for the same date may be given more than one value for %Total Sales. (For some reason this is a valid business scenario that happens rarely, but it happens.)
In that situation only, the requirement is to sum the two values together into one line. So if Item# 123 has a line with a value of .05%, and another line with a value of .08%, I need to sum those two values into one line for Item #123 that has a %Total of .13%. ONLY when an item has more than one percentage assigned, those percentages should be summed. Otherwise, if an item# has only one percentage value assigned, we just want to see that value.
I cannot figure out how to do this. Basically, I would like to implement logic that would work like this:
SELECT Date, Store#, Item#,
CASE WHEN Count(%Total Sales) >1 THEN Sum(%Total Sales)
ELSE %Total Sales
END
FROM (some tables and joins)
GROUP BY Date, Store#, Item#
However, I'm not sure how to craft it so that I don't get a syntax error (this query produces errors).
Any help would be appreciated.
Thank you!
Provided you group by Date, Store#, Item#, the only groups that contain more than one row are those where the same item has multiple values for %Total Sales on the same date. For every other group, the sum of a single value is just that value.
Therefore grouping the items and summing should be sufficient; no CASE is needed.
SELECT Date, Store#, Item#, Sum(%Total Sales)
FROM (some tables and joins)
GROUP BY Date, Store#, Item#
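A quick way to convince yourself, sketched with SQLite through Python's sqlite3 module. The table and column names (Sales, SaleDate, Store, Item, PctTotalSales) are simplified stand-ins for the real ones, and the rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (SaleDate TEXT, Store INTEGER, Item INTEGER, PctTotalSales REAL);
    INSERT INTO Sales VALUES
        ('2024-03-01', 1, 123, 0.05),   -- item 123 has two values on the same date
        ('2024-03-01', 1, 123, 0.08),
        ('2024-03-01', 1, 456, 0.20);   -- item 456 has only one
""")

rows = conn.execute("""
    SELECT SaleDate, Store, Item,
           COUNT(*)           AS n,
           SUM(PctTotalSales) AS PctTotal
    FROM Sales
    GROUP BY SaleDate, Store, Item
    ORDER BY Item
""").fetchall()

for row in rows:
    print(row)
```

Item 123 comes back as one row with the two values added together, and item 456 comes back with its single value unchanged, so the plain SUM already covers both cases.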
Why don't you update the existing row instead of inserting a new value into the table?
That avoids manipulating the data after insertion.
I believe the better idea is to check for an existing row for that date and item at insert time, and update the percentage right then.
Is that not possible?

Is there a way to handle immutability that's robust and scalable?

Since BigQuery is append-only, I was thinking about stamping each record I upload with an 'effective date', similar to how PeopleSoft works, if anybody is familiar with that pattern.
Then, I could issue a select statement and join on the max effective date:
select UTC_USEC_TO_MONTH(timestamp) as month, sum(amt)/100 as sales
from foo.orders as all
join (select id, max(effdt) as max_effdt from foo.orders group by id) as latest
on all.effdt = latest.max_effdt and all.id = latest.id
group by month
order by month;
Unfortunately, I believe this won't scale because of the BigQuery 'small joins' restriction, so I wanted to see if anyone else had thought through this use case.
Yes, adding a timestamp for each record (or in some cases, a flag that captures the state of a particular record) is the right approach. The small side of a BigQuery "Small Join" can actually return at least 8MB (this value is compressed on our end, so is usually 2 to 10 times larger), so for "lookup" table type subqueries, this can actually provide a lot of records.
In your case, it's not clear to me what exact query you are trying to run. It looks like you are trying to return the most recent sales times of every individual item, and then JOIN this information with the SUM of sales amt per month of each item? Can you provide more info about the query?
It might be possible to do this all in one query. For example, in our Wikipedia dataset, an example might look something like...
SELECT contributor_username,
       UTC_USEC_TO_MONTH(timestamp * 1000000) AS month,
       SUM(num_characters) AS total_characters_used
FROM [publicdata:samples.wikipedia]
WHERE contributor_username != ''
  AND contributor_username IS NOT NULL
  AND timestamp > 1133395200
  AND timestamp < 1157068800
GROUP BY contributor_username, month
ORDER BY contributor_username DESC, month DESC;
...to provide Wikipedia contributions per user per month (like sales per month per item). This result is actually really large, so you would have to limit by date range.
UPDATE (based on comments below): a similar query that finds "num_characters" for the latest Wikipedia revisions by contributors after a particular time...
SELECT current.contributor_username, current.num_characters
FROM
  (SELECT contributor_username, num_characters, timestamp AS time
   FROM [publicdata:samples.wikipedia]
   WHERE contributor_username != '' AND contributor_username IS NOT NULL) AS current
JOIN
  (SELECT contributor_username, MAX(timestamp) AS time
   FROM [publicdata:samples.wikipedia]
   WHERE contributor_username != '' AND contributor_username IS NOT NULL
     AND timestamp > 1265073722
   GROUP BY contributor_username) AS latest
ON current.contributor_username = latest.contributor_username
AND current.time = latest.time;
If your query requires you to first build a large aggregate (for example, you need to run essentially an accurate COUNT DISTINCT), another option is to break this query up into two queries. The first query could provide the max effective date by month along with a count, and save this result as a new table. Then you could run a sum query on the resulting table.
You could also store monthly sales records in separate tables, and only query the particular table for the months you are interested in, simplifying your monthly sales summaries (this could also be a more economical use of BigQuery). When you need to find aggregates across all tables, you could run your queries with multiple tables listed after the FROM clause.
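The two-step idea can be sketched locally. This is a simplified illustration using SQLite through Python's sqlite3 module, not BigQuery: the orders table, its month column, and the rows are invented, and the intermediate table here holds only the latest effective date per id. Step 1 materializes that table; step 2 joins against it and sums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Append-only table: a correction is a new row with a later effdt.
    CREATE TABLE orders (id INTEGER, effdt TEXT, month TEXT, amt INTEGER);
    INSERT INTO orders VALUES
        (1, '2024-01-02', '2024-01', 1000),
        (1, '2024-01-09', '2024-01', 1500),   -- supersedes the row above
        (2, '2024-01-03', '2024-01', 2000),
        (3, '2024-02-01', '2024-02', 500),
        (3, '2024-02-05', '2024-02', 700);    -- supersedes the row above

    -- Step 1: save the latest effective date per id as its own table.
    CREATE TABLE latest AS
        SELECT id, MAX(effdt) AS max_effdt FROM orders GROUP BY id;
""")

# Step 2: sum only the current version of each record.
monthly_sales = conn.execute("""
    SELECT o.month, SUM(o.amt) / 100.0 AS sales
    FROM orders AS o
    JOIN latest AS l ON o.id = l.id AND o.effdt = l.max_effdt
    GROUP BY o.month
    ORDER BY o.month
""").fetchall()

for row in monthly_sales:
    print(row)
```

The superseded rows (1000 and 500) are left out of the totals because they do not match the latest effective date for their id.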

SQL query question

I'm trying to do something in a query that I've never done before. It probably requires variables, but I've never used them, and I'm not sure that it does.
What I want is to get a list of sales, grouped first by affiliate, then by its month.
I can do that, but here's the twist... I don't want the calendar month, but month 1, month 2, month 3...
And those aren't Jan, Feb, March, but the number of months since the day of the first sale.
Is this possible in a query at all, or do I need to do this in my code?
Oh, MySQL 5.1.something...
Sure, just write an expression in SQL that generates the number of months since the first sale. (Do you mean the first sale for that affiliate? If so, you'll need a subquery.)
And since you say you want a list of sales, I assume you don't really want to "Group By" affiliate and month count; you just want to sort, or "Order By", those values.
If you wanted the average sales amount, or the count of sales, or some other aggregate function of sales data, then you would be doing a "Group By"...
And I don't think you need to worry about sorting by the number of months; you can simply sort by the difference between each sales date and the earliest sale date for each affiliate. (If you wanted to apply a third sorting rule, after the sales date sort, then you would need to be more careful.)
Select * From Sales S
Order By Affiliate,
SalesDate - (Select Min(SalesDate)
From Sales
Where Affiliate = S.Affiliate)
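Or, if you really need it to be by the difference in months (TimestampDiff counts whole months and handles year boundaries, which subtracting Month() values does not):
Select * From Sales S
Order By Affiliate,
TimestampDiff(Month,
(Select Min(SalesDate)
From Sales
Where Affiliate = S.Affiliate),
SalesDate)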
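If you want the "month 1, month 2, month 3" number as a column rather than only as a sort key, something like the following sketch may help. It assumes SQLite through Python's sqlite3 module rather than MySQL, with an invented Sales table and rows, and it numbers by calendar month (year * 12 + month) relative to the affiliate's first sale, which is one possible reading of "months since the first sale".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (Affiliate TEXT, SalesDate TEXT, Amount REAL);
    INSERT INTO Sales VALUES
        ('A', '2023-11-15', 10.0), ('A', '2023-12-02', 20.0), ('A', '2024-01-20', 30.0),
        ('B', '2024-01-05', 5.0),  ('B', '2024-03-01', 15.0);
""")

rows = conn.execute("""
    SELECT s.Affiliate,
           -- Calendar-month index of this sale minus that of the first sale, plus 1.
           (CAST(strftime('%Y', s.SalesDate) AS INTEGER) * 12
              + CAST(strftime('%m', s.SalesDate) AS INTEGER))
         - (CAST(strftime('%Y', f.FirstSale) AS INTEGER) * 12
              + CAST(strftime('%m', f.FirstSale) AS INTEGER))
         + 1 AS MonthNumber,
           s.Amount
    FROM Sales AS s
    JOIN (SELECT Affiliate, MIN(SalesDate) AS FirstSale
          FROM Sales
          GROUP BY Affiliate) AS f
      ON f.Affiliate = s.Affiliate
    ORDER BY s.Affiliate, MonthNumber, s.SalesDate
""").fetchall()

for row in rows:
    print(row)
```

Affiliate A's sale in January 2024 lands in month 3 even though it crosses a year boundary, and affiliate B has no month 2 because it made no sale in February.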
This is possible in standard SQL if you use what I like to call "SQL gymnastics". It can be done with subqueries.
But it looks incredibly ugly, is hard to maintain and it's really not worth it. You're far better off using one of the many programming languages that wrap SQL (such as PL/SQL) or even a general purpose language that can call SQL (such as Python).
The result will be in two languages but will be all the more understandable than the same thing written in just SQL.