Select largest date from column based on another column in table - sql

I'm new to SQL and am trying to get a certain date for jobs from a table. The only way to get these dates is to look in a massive table where every item for each job is stored with a last transaction date. The date I want is the largest date in the lst_trx_date column for each job.
The data in the table looks something like this:
Each job has a varying number of items. My biggest hurdle and my main question: instead of selecting the entire job table, how can I select only the largest lst_trx_date for each job? I initially brought in the data using Microsoft Query, but I realize my request will probably require modifying the SQL command text directly.

Try something like this. It will give you the max date for a single job:
SELECT MAX(lst_trx_date) AS "Max Date"
FROM table
WHERE job = 1234;

To get the latest date for each job, you can use window functions. As an example, try the following, which partitions by job so that each job yields exactly one row (the one with its latest lst_trx_date):
select job, item, lst_trx_date
from (
    select job, item, lst_trx_date,
           row_number() over (partition by job order by lst_trx_date desc) rn
    from <table>
) t
where rn = 1

I think it would be along these lines:
SELECT job, MAX(lst_trx_date) AS last_transaction_date
FROM table
GROUP BY job
ORDER BY last_transaction_date DESC
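For anyone who wants to try the GROUP BY approach without access to the real system, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table name job_items and the sample rows are made up for illustration; only the column names job, item, and lst_trx_date come from the question. Dates are stored as ISO-8601 text, which sorts correctly as a string.

```python
import sqlite3

# Hypothetical stand-in for the real job table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_items (job INTEGER, item TEXT, lst_trx_date TEXT)")
conn.executemany(
    "INSERT INTO job_items VALUES (?, ?, ?)",
    [
        (1234, "A", "2019-01-05"),
        (1234, "B", "2019-03-17"),
        (5678, "A", "2018-11-30"),
        (5678, "C", "2018-12-02"),
        (5678, "D", "2018-10-01"),
    ],
)

# One row per job, carrying only the largest lst_trx_date for that job.
latest_per_job = conn.execute(
    """
    SELECT job, MAX(lst_trx_date) AS last_transaction_date
    FROM job_items
    GROUP BY job
    ORDER BY last_transaction_date DESC
    """
).fetchall()

for job, last_date in latest_per_job:
    print(job, last_date)
```

With this sample data the query returns two rows, one per job, instead of the five item rows in the table.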

Related

Query monitoring changes in the field

I need to program a query where I can see the changes that certain fields have undergone in a certain date period.
Example: from the CAM_CONCEN table, bring back those records where the CONTACT field of an ACCOUNT_NUMBER was modified within the 6 months before a given date.
I would be grateful if you can guide me.
You can use LAG() to peek at the previous row of a particular subset of rows (the same account in this case).
For example:
select *
from (
    select c.*,
           lag(contact) over (partition by account_number
                              order by change_date) as prev_contact
    from cam_concen c
) x
where contact <> prev_contact
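The LAG() idea can be tried out with the sketch below, which runs against an in-memory SQLite database from Python (window functions need SQLite 3.25 or newer). The sample rows, the as_of parameter, and the 6-month filter written with SQLite's date() function are assumptions added for illustration; change_date is the column name used in the answer above, not one confirmed by the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cam_concen (account_number INTEGER, contact TEXT, change_date TEXT)"
)
conn.executemany(
    "INSERT INTO cam_concen VALUES (?, ?, ?)",
    [
        (1, "alice@example.com", "2020-01-10"),
        (1, "alice@example.com", "2020-03-01"),
        (1, "alice@work.example", "2020-05-20"),  # changed inside the window
        (2, "bob@example.com", "2020-02-01"),
        (2, "bob@example.com", "2020-06-01"),     # never changed
        (3, "carol@example.com", "2019-01-01"),
        (3, "carol@new.example", "2019-06-15"),   # changed, but too long ago
    ],
)

# LAG() is evaluated over each account's full history first; the date filter
# is applied afterwards, so a change inside the window is still compared
# with the previous row even if that row lies outside the window.
changes = conn.execute(
    """
    SELECT account_number, prev_contact, contact, change_date
    FROM (
        SELECT c.*,
               LAG(contact) OVER (PARTITION BY account_number
                                  ORDER BY change_date) AS prev_contact
        FROM cam_concen c
    ) x
    WHERE contact <> prev_contact
      AND change_date >= date(:as_of, '-6 months')
      AND change_date <= :as_of
    ORDER BY account_number, change_date
    """,
    {"as_of": "2020-07-01"},
).fetchall()

for row in changes:
    print(row)
```

The first row of each account has a NULL prev_contact, so the <> comparison is not true and that row drops out on its own.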

Bigquery - how to aggregate data based on conditions

I have a simple table like the following, which has product, price, cost and category. price and cost can be null.
This table is updated from time to time. I want a daily summary of the table content grouped by category, showing for each category how many products have no price, how many have a price, and how many have a price that is higher than the cost, so the result table would look like the following:
I think I can get a query running every day by setting up a scheduled query in BigQuery, so that three rows of data are appended to the result table every day.
But the problem is, how can I get those three rows? I know I can use GROUP BY, but how do I get counts with conditions such as not null, larger than, etc.?
You seem to want window functions:
select t.*,
       countif(price is null) over (partition by date) as products_no_price,
       countif(price <= cost) over (partition by date) as products_price_lower_than_cost
from t;
You can run this code on the table that has a date column. In fact, you don't need to store the last two columns.
If you want to insert the first table into the second, then there is no date and you can simply use:
select t.*,
       countif(price is null) over () as products_no_price,
       countif(price <= cost) over () as products_price_lower_than_cost
from t;
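Since the question asks for one summary row per category, a plain GROUP BY with conditional counts is another option. The sketch below swaps that in for the window functions above and runs it on an in-memory SQLite database from Python. SQLite has no COUNTIF, so COUNT(CASE WHEN ... THEN 1 END) is used as a portable stand-in; in BigQuery the same three columns could be written with COUNTIF. The table name products and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product TEXT, price REAL, cost REAL, category TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        ("p1", None, 5.0, "toys"),    # no price
        ("p2", 10.0, 5.0, "toys"),    # price above cost
        ("p3", 4.0, 5.0, "toys"),     # price below cost
        ("p4", None, None, "books"),  # no price, no cost
        ("p5", 20.0, 12.0, "books"),  # price above cost
    ],
)

# COUNT ignores NULLs, so a CASE without ELSE counts only the matching rows.
summary = conn.execute(
    """
    SELECT category,
           COUNT(CASE WHEN price IS NULL THEN 1 END) AS products_no_price,
           COUNT(price)                              AS products_with_price,
           COUNT(CASE WHEN price > cost THEN 1 END)  AS products_price_above_cost
    FROM products
    GROUP BY category
    ORDER BY category
    """
).fetchall()

for row in summary:
    print(row)
```

A comparison against a NULL price or cost is never true, so those rows are simply left out of the price-above-cost count.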

Efficiently find last date in a table - Teradata SQL

Say I have a rather large table in a Teradata database, "Sales" that has a daily record for every sale and I want to write a SQL statement that limits this to the latest date only. This will not always be the previous day, for example, if it was a Monday the latest date would be the previous Friday.
I know I can get the results by the following:
SELECT s.*
FROM Sales s
JOIN (
    SELECT MAX(SalesDate) AS SalesDate
    FROM Sales
) sd
ON s.SalesDate = sd.SalesDate
I don't know how the subquery would be processed. Since Sales is a large table, is there a more efficient way to do this, given that there is no other table I could use?
Another (more flexible) way to get the top n utilizes OLAP-functions:
SELECT *
FROM Sales s
QUALIFY
RANK() OVER (ORDER BY SalesDate DESC) = 1
This will return all rows with the max date. If you want only one of them switch to ROW_NUMBER.
That is probably fine, if you have an index on salesdate.
If there is only one row, then I would recommend:
select top 1 s.*
from sales s
order by salesdate desc;
In particular, this should make use of an index on salesdate.
If there is more than one row, use top 1 with ties.
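The "all rows on the latest date" result can be reproduced on a small scale with the sketch below, which uses an in-memory SQLite database from Python. SQLite has neither QUALIFY nor TOP, so the sketch swaps in a scalar subquery on MAX(SalesDate); it only illustrates which rows come back and says nothing about how Teradata would execute any of the variants above. The SaleId and Amount columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SaleId INTEGER, SalesDate TEXT, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, "2021-03-04", 10.0),
        (2, "2021-03-05", 20.0),  # latest date, first sale
        (3, "2021-03-05", 5.5),   # latest date, second sale
        (4, "2021-03-03", 7.0),
    ],
)

# Keep every row whose SalesDate equals the single latest date in the table.
latest_rows = conn.execute(
    """
    SELECT SaleId, SalesDate, Amount
    FROM Sales
    WHERE SalesDate = (SELECT MAX(SalesDate) FROM Sales)
    ORDER BY SaleId
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Both sales dated 2021-03-05 are returned, which matches the behaviour described for RANK() and for top 1 with ties.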

SQL Server: I have multiple records per day and I want to return only the first of the day

I have some records that track inquiries by DATETIME. There is a glitch in the system, and sometimes a record is entered multiple times on the same day. I have a query with a bunch of correlated subqueries attached to these records, but the numbers are off because the glitch makes some leads show up multiple times. I need the first entry of the day. I tried fooling around with MIN, but I couldn't quite get it to work.
I currently have this, I am not sure if I am on the right track though.
SELECT SL.UserID, MIN(SL.Added) OVER (PARTITION BY SL.UserID)
FROM SourceLog AS SL
Here's one approach using row_number():
select *
from (
    select *,
           row_number() over (partition by userid, cast(added as date) order by added) rn
    from sourcelog
) t
where rn = 1
You could use group by along with min to accomplish this.
How you do this depends on how your data is structured. If you assign a unique sequential number to each record as it is created, you can just return the lowest number created per day. Otherwise, you would need to return the ID of the record with the earliest DATETIME value per day.
--Assumes sequential IDs
select
    min(Id)
from
    [YourTable]
group by
    --the conversion is used to strip the time value out of the date/time
    convert(date, [YourDateTime])
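To see the row_number() approach return the first entry per user per day, here is a minimal sketch on an in-memory SQLite database from Python (window functions need SQLite 3.25 or newer). SQLite's date() function stands in for SQL Server's cast(added as date); the id column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sourcelog (id INTEGER, userid INTEGER, added TEXT)")
conn.executemany(
    "INSERT INTO sourcelog VALUES (?, ?, ?)",
    [
        (1, 10, "2015-06-01 08:15:00"),
        (2, 10, "2015-06-01 08:15:02"),  # duplicate entry on the same day
        (3, 10, "2015-06-02 09:00:00"),
        (4, 20, "2015-06-01 13:30:00"),
        (5, 20, "2015-06-01 17:45:00"),  # second entry on the same day
    ],
)

# Number the rows within each (user, calendar day) group by time of entry,
# then keep only the earliest row of each group.
first_per_day = conn.execute(
    """
    SELECT id, userid, added
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY userid, date(added)
                                  ORDER BY added) AS rn
        FROM sourcelog s
    ) t
    WHERE rn = 1
    ORDER BY userid, added
    """
).fetchall()

for row in first_per_day:
    print(row)
```

Unlike the min(Id) variant, this does not rely on IDs being assigned in time order, and it keeps the whole row rather than only the ID.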

T-SQL (Looking for a way to find new records added under new date range)

I have a single table that compiles records, and each record has a date associated with it. For instance, my most recent date range is 2014-2-8 and my second most recent is 2014-1-6. This query is going to be placed in a report, so I would like the code to be dynamic.
I can figure out the max date using
select max((alias.date))
from table as alias1
where alias.date = alias1.date
My end goal is to write a select with id, productname, and date that pulls in whatever products are new in the most recent date range.
Essentially, any product associated with the max date would be compared against the second-to-max date (sorry, stupid, I know) to produce those products.
I hope this makes sense.
Thank you very much.
If you just want the row with the highest date:
MSSQL: select top 1 * from table order by datefield desc
MySQL: select * from table order by datefield desc limit 1
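If the goal is the one described in the question (products that appear on the max date but not on the second-to-max date), one possible reading can be sketched as below, using an in-memory SQLite database from Python. This is an interpretation of the question, not something the answer above confirms. The table name product_loads, the column name load_date, and the sample rows are hypothetical, and dates are stored as ISO-8601 text so that MAX() picks the latest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_loads (id INTEGER, productname TEXT, load_date TEXT)"
)
conn.executemany(
    "INSERT INTO product_loads VALUES (?, ?, ?)",
    [
        (1, "widget", "2014-01-06"),
        (2, "gadget", "2014-01-06"),
        (3, "widget", "2014-02-08"),
        (4, "gadget", "2014-02-08"),
        (5, "gizmo", "2014-02-08"),     # not present on 2014-01-06
        (6, "sprocket", "2013-12-02"),  # older load, not the one compared against
        (7, "sprocket", "2014-02-08"),  # not present on 2014-01-06
    ],
)

# Rows from the latest load whose product name has no row in the load
# immediately before it. Both dates are computed from the data itself,
# so nothing has to be edited when a new load arrives.
new_products = conn.execute(
    """
    SELECT p.id, p.productname, p.load_date
    FROM product_loads p
    WHERE p.load_date = (SELECT MAX(load_date) FROM product_loads)
      AND NOT EXISTS (
          SELECT 1
          FROM product_loads q
          WHERE q.productname = p.productname
            AND q.load_date = (
                SELECT MAX(load_date)
                FROM product_loads
                WHERE load_date < (SELECT MAX(load_date) FROM product_loads)
            )
      )
    ORDER BY p.id
    """
).fetchall()

for row in new_products:
    print(row)
```

Note that sprocket counts as new here even though it existed in an older load, because only the second-to-max date is compared, as the question describes.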