SQL MAX: max date from multiple locations same part - sql

what I'm looking to find is that last or max date a part number was purchased from any store. so we can have so sales or sales and just give the max date:
part
date
loc
123
8/1/2022
store 1
123
8/2/2022
store 1
123
null
store 2
123
8/3/2022
store 3
result would be:
part
date
Loc
123
8/3/2022
store 1
123
8/3/2022
store 2
123
8/3/2022
store 3

Select the max date in a subquery for every part, it would give you one Result, the highest date.
The Query should work with most rdms
SELECT DISTINCT [part], (SELECT MAX([date]) FROM Table1 WHERE part = t1.part) [Date],[loc] FROM Table1 t1
part | Date | loc
---: | :------- | :------
123 | 8/3/2022 | store 1
123 | 8/3/2022 | store 2
123 | 8/3/2022 | store 3
db<>fiddle here

I am sure there is a more efficient way to do the query but I used a subquery. this should get you the desired result
SELECT DISTINCT m.[part], ad.x AS 'date', m.[loc]
FROM [MainTable] AS 'm'
LEFT OUTER JOIN
(SELECT MAX([date]) AS 'x', [part]
FROM [MainTable]
GROUP BY [part]) AS 'ad'
WHERE m.[part] = 123 --desired value
nbk's answer
There is the cleaner query.

Related

Alias Reference Date_Diff Days. Need to Parse or create temp table with dates?

Below I have the tables and query which output the below
Table1
EmployeeID | StartDateTimestamp | CohortID | CohortName
---------- | ------------------ | -------- | ----------
1 | 20080101 01:30:00 | 1 | Peanut
1 | 20090204 01:01:00 | 2 | Apple
2 | 20190107 05:52:14 | 1 | Peanut
3 | 20190311 02:35:26 | 2 | Apple
Employee
EmployeeID | HireStartName | StartDateTimestamp2
---------- | ------------- | -------------------
1 | HiredStart | 20080501 01:30:00
1 | DeferredStart | 20090604 01:01:00
2 | HiredStart | 20190115 05:52:14
3 | HiredStart | 20190330 02:35:26
Query
select
t.cohortid,
min(e.startdatetimestamp2) first,
max(e.startdatetimestamp2) last
from table1 t
inner join employee e on e.employeeid = t.employeeid
group by t.cohort_id
Output
ID | FIRST | LAST
1 |20190106 12:00:05 |20180214 03:45:12
2 |20180230 01:45:23 |20180315 01:45:23
My attempt:
SELECT DATE_DIFF(first, last, Day), ID, max(datecolumn1) first, min(datecolumn1) last
Error: Unrecognized name.
How do I enter the reference alias first and last in a Date_Diff?
Do I need to derive a table?
Clarity: Trying to avoid inputting in the dates, since I am looking to find the date diff of both first and last columns for as many rows as there is data.
This answer has been discussed here: Date Difference between consecutive rows
DateDiff has deprecated, and now it is Date_Diff (first, last, day)
Then I tried:
SELECT ID, DATE_DIFF(PARSE_DATE('%y%m%d',t.first), PARSE_DATE('%y%m%d',t.last), DAY) days
FROM table
Failed to parse input string "20180125 01:00:05"
Tried this
SELECT CohortID, date_diff(first,last,day) as days
FROM (select cohortid,min(startdatetimestamp2) first,
max(startdatetimestamp2) last
FROM employee
JOIN table1 on table1.employeeid = employee.employeeid
group by cohortid)
I get days not found on either side of join
Regarding your first question about using aliases in a query, there are some restriction where to use them, specially in the FROM, GROUP BY and ORDER BY statements. I encourage you to have a look here to check these restrictions.
About your main issue, obtaining the date difference between two dates. I would like to point that your timestamp data, in both of your tables, are actually considered as DATETIME format in BigQuery. Therefore, you should use DATETIME builtin functions to get the desired results.
The below query uses the data you provided to obtain the aimed output.
WITH
data AS
(
SELECT
t.cohortid AS ID,
PARSE_DATETIME('%Y%m%d %H:%M:%S', MIN(e.startdatetimestamp2)) AS first,
PARSE_DATETIME('%Y%m%d %H:%M:%S', MAX(e.startdatetimestamp2)) AS last
FROM
`test-proj-261014.sample.table1` t
INNER JOIN
`test-proj-261014.sample.employee` e
ON
e.employeeid = t.employeeid
GROUP BY t.cohortid
)
SELECT
ID,
first,
last,
DATETIME_DIFF(last, first, DAY ) AS diff_days
FROM
data
And the output:
Notice that I created a temp table to format the fields StartDateTimestamp and StartDateTimestamp2, using the PARSE_DATETIME(). Afterwards, I used the DATETIME_DIFF() method to obtain the difference in days between the two fields.

Calculate time span over a number of records

I have a table that has the following schema:
ID | FirstName | Surname | TransmissionID | CaptureDateTime
1 | Billy | Goat | ABCDEF | 2018-09-20 13:45:01.098
2 | Jonny | Cash | ABCDEF | 2018-09-20 13:45.01.108
3 | Sally | Sue | ABCDEF | 2018-09-20 13:45:01.298
4 | Jermaine | Cole | PQRSTU | 2018-09-20 13:45:01.398
5 | Mike | Smith | PQRSTU | 2018-09-20 13:45:01.498
There are well over 70,000 records and they store logs of transmissions to a web-service. What I'd like to know is how would I go about writing a script that would select the distinct TransmissionID values and also show the timespan between the earliest CaptureDateTime record and the latest record? Essentially I'd like to see what the rate of records the web-service is reading & writing.
Is it even possible to do so in a single SELECT statement or should I just create a stored procedure or report in code? I don't know where to start aside from SELECT DISTINCT TransmissionID for this sort of query.
Here's what I have so far (I'm stuck on the time calculation)
SELECT DISTINCT [TransmissionID],
COUNT(*) as 'Number of records'
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
Not sure how to get the difference between the first and last record with the same TransmissionID I would like to get a result set like:
TransmissionID | TimeToCompletion | Number of records |
ABCDEF | 2.001 | 5000 |
Simply GROUP BY and use MIN / MAX function to find min/max date in each group and subtract them:
SELECT
TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime))
FROM yourdata
GROUP BY TransmissionID
HAVING COUNT(*) > 1
Use min and max to calculate timespan
SELECT [TransmissionID],
COUNT(*) as 'Number of records',datediff(s,min(CaptureDateTime),max(CaptureDateTime)) as timespan
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
A method that returns the average time for all transmissionids, even those with only 1 record:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime)) * 1.0 / NULLIF(COUNT(*) - 1, 0)
FROM yourdata
GROUP BY TransmissionID;
Note that you may not actually want the maximum of the capture date for a given transmissionId. You might want the overall maximum in the table -- so you can consider the final period after the most recent record.
If so, this looks like:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second,
MIN(CaptureDateTime),
MAX(MAX(CaptureDateTime)) OVER ()
) * 1.0 / COUNT(*)
FROM yourdata
GROUP BY TransmissionID;

Combine two tables into a new one so that select rows from the other one are ignored

I have two tables that have identical columns. I would like to join these two tables together into a third one that contains all the rows from the first one and from the second one all the rows that have a date that doesn't exist in the first table for the same location.
Example:
transactions:
date |location_code| product_code | quantity
------------+------------------+--------------+----------
2013-01-20 | ABC | 123 | -20
2013-01-23 | ABC | 123 | -13.158
2013-02-04 | BCD | 234 | -4.063
transactions2:
date |location_code| product_code | quantity
------------+------------------+--------------+----------
2013-01-20 | BDE | 123 | -30
2013-01-23 | DCF | 123 | -2
2013-02-05 | UXJ | 234 | -6
Desired result:
date |location_code| product_code | quantity
------------+------------------+--------------+----------
2013-01-20 | ABC | 123 | -20
2013-01-23 | ABC | 123 | -13.158
2013-01-23 | DCF | 123 | -2
2013-02-04 | BCD | 234 | -4.063
2013-02-05 | UXJ | 234 | -6
How would I go about this? I tried for example this:
SELECT date, location_code, product_code, type, quantity, location_type, updated_at
,period_start_date, period_end_date
INTO transactions_combined
FROM ( SELECT * FROM transactions_kitchen k
UNION ALL
SELECT *
FROM transactions_admin h
WHERE h.date NOT IN (SELECT k.date FROM k)
) AS t;
but that doesn't take into account that I'd like to include the rows that have the same date, but different location. I have Postgresql 9.2 in use.
UNION simply doesn't do what you describe. This query should:
CREATE TABLE AS
SELECT date, location_code, product_code, quantity
FROM transactions_kitchen k
UNION ALL
SELECT h.date, h.location_code, h.product_code, h.quantity
FROM transactions_admin h
LEFT JOIN transactions_kitchen k USING (location_code, date)
WHERE k.location_code IS NULL;
LEFT JOIN / IS NULL to exclude rows from the second table for the same location and date. See:
Select rows which are not present in other table
Use CREATE TABLE AS instead of SELECT INTO. The manual:
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO
is not available in ECPG or PL/pgSQL, because they interpret the
INTO clause differently. Furthermore, CREATE TABLE AS offers a
superset of the functionality provided by SELECT INTO.
Or, if the target table already exists:
INSERT INTO transactions_combined (<list names of target column here!>)
SELECT ...
Aside: I would not use date as column name. It's a reserved word in every SQL standard and a function and data type name in Postgres.
Change UNION ALL to just UNION and it should return only unique rows from each table.

SQL: Select distinct sum of column with max(column)

I have a salary table like this:
id | person_id | start_date | pay
1 | 1234 | 2012-01-01 | 3000
2 | 1234 | 2012-05-01 | 3500
3 | 5678 | 2012-01-01 | 5000
4 | 5678 | 2013-01-01 | 6000
5 | 9101 | 2012-09-01 | 2000
6 | 9101 | 2014-04-01 | 3000
7 | 9101 | 2011-01-01 | 1500
and so on...
Now I want to query the sum of the salaries of a specific month for all persons of a company.
I already have the ids of the persons who worked in the specific month in the specific company, so I can do something like WHERE person_id IN (...)
I have some problems with the salaries query though. The result for e.g. the month 2012-08 should be:
10000
which is 3500+5000+1500.
So I need to find the summed up pay value (for all persons in the IN clause) for the maximum start_date <= the specific month.
I tried various INNER JOINS but it's been a long day and I can't think straight at the moment.
Any hint is highly appreciated.
You need to get the active record. This following does this by calculating the max start date before the month in question:
select sum(s.pay)
from (select person_id, max(start_date) as maxstartdate
from salary
where person_id in ( . . . ) and
start_date < <first day of month of interest>
group by person_id
) p join
salary s
on s.person_id = p.person_id and
s.maxstartdate = p.start_date
You need to fill in the month and list of ids.
You can also do this with ranking functions, but you don't specify which SQL engine you are using.
You have to use group by for these things....
select person_id,sum(pay) from salary where person_id in(...) group by person_id
may it will helps you.....

MIN() Function in SQL

Need help with Min Function in SQL
I have a table as shown below.
+------------+-------+-------+
| Date_ | Name | Score |
+------------+-------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/05 | Jones | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/06 | James | 3 |
| 2012/07/07 | Hugo | 1 |
| 2012/07/07 | Jack | 1 |
| 2012/07/07 | Jim | 2 |
+------------+-------+-------+
I would like to get the output like below
+------------+------+-------+
| Date_ | Name | Score |
+------------+------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/07 | Hugo | 1 |
+------------+------+-------+
When I use the MIN() function with just the date and Score column I get the lowest score for each date, which is what I want. I don't care which row is returned if there is a tie in the score for the same date. Trouble starts when I also want name column in the output. I tried a few variation of SQL (i.e min with correlated sub query) but I have no luck getting the output as shown above. Can anyone help please:)
Query is as follows
SELECT DISTINCT
A.USername, A.Date_, A.Score
FROM TestTable AS A
INNER JOIN (SELECT Date_,MIN(Score) AS MinScore
FROM TestTable
GROUP BY Date_) AS B
ON (A.Score = B.MinScore) AND (A.Date_ = B.Date_);
Use this solution:
SELECT a.date_, MIN(name) AS name, a.score
FROM tbl a
INNER JOIN
(
SELECT date_, MIN(score) AS minscore
FROM tbl
GROUP BY date_
) b ON a.date_ = b.date_ AND a.score = b.minscore
GROUP BY a.date_, a.score
SQL-Fiddle Demo
This will get the minimum score per date in the INNER JOIN subselect, which we use to join to the main table. Once we join the subselect, we will only have dates with names having the minimum score (with ties being displayed).
Since we only want one name per date, we then group by date and score, selecting whichever name: MIN(name).
If we want to display the name column, we must use an aggregate function on name to facilitate the GROUP BY on date and score columns, or else it will not work (We could also use MAX() on that column as well).
Please learn about the GROUP BY functionality of RDBMS.
SELECT Date_,Name,MIN(Score)
FROM T
GROUP BY Name
This makes the assumption that EACH NAME and EACH date appears only once, and this will only work for MySQL.
To make it work on other RDBMSs, you need to apply another group function on the Date column, like MAX. MIN. etc
SELECT T.Name, T.Date_, MIN(T.Score) as Score FROM T
GROUP BY T.Date_
Edit: This answer is not corrected as pointed out by JNK in comments
SELECT Date_,MAX(Name),MIN(Score)
FROM T
GROUP BY Date_
Here I am using MAX(NAME), it will pick one name if two names were found with the same goal numbers.
This will find Min score for each day (no duplicates), scored by any player. The name that starts with Z will be picked first than the name that starts with A.
Edit: Fixed by removing group by name