Get the latest version of every row in SQL


I need to grab the latest version of every row so that I don't get duplicate data. "_sdc_sequence" is a Unix epoch attached to the record during replication; it determines the order of all the versions of a row.
I would like to get cost and impressions for each campaign every day.
I have tried to use INNER JOIN but I could not get the data. When I tried to use "account" and "clientname" as the attributes (every row has the same clientname and account), I got zero for cost and impressions. Maybe the attributes are wrong.
SELECT DISTINCT day, cost, impressions, campaign
FROM `adxxxxx_xxxxxxxx` account
INNER JOIN (
SELECT
MAX(_sdc_sequence) AS seq,
campaignid
FROM `adxxxxx_xxxxxxxx`
GROUP BY campaignid) clientname
ON account.campaignid = clientname.campaignid
AND account._sdc_sequence = clientname.seq
ORDER by day
Is there another way to do this, or how can I fix it?
Thank you.

#standardSQL
SELECT row.* FROM (
SELECT ARRAY_AGG(t ORDER BY _sdc_sequence DESC LIMIT 1)[OFFSET(0)] row
FROM `adxxxxx_xxxxxxxx` t
GROUP BY campaignid
)
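The ARRAY_AGG form above is BigQuery syntax. As a rough, engine-neutral sketch of the same "keep only the newest version" idea, the join from the question can be made to work by grouping on both the day and the campaign, so each (day, campaign) pair keeps its own highest `_sdc_sequence`. The table name `ads`, the column types, and the sample rows are hypothetical; SQLite (via Python's standard library) is used only so the query can be run as-is.

```python
import sqlite3

# Hypothetical rows: (day, campaignid, campaign, cost, impressions, _sdc_sequence)
ROWS = [
    ("2019-01-01", 1, "brand", 10.0, 100, 1001),
    ("2019-01-01", 1, "brand", 12.0, 120, 1002),  # newer version of the same row
    ("2019-01-01", 2, "promo", 5.0, 50, 1001),
    ("2019-01-02", 1, "brand", 7.0, 70, 1003),
    ("2019-01-02", 1, "brand", 8.0, 80, 1004),    # newer version of the same row
]

# Join each row to the highest sequence seen for its (day, campaignid) pair.
QUERY = """
SELECT a.day, a.campaign, a.cost, a.impressions
FROM ads AS a
INNER JOIN (
    SELECT day, campaignid, MAX(_sdc_sequence) AS seq
    FROM ads
    GROUP BY day, campaignid
) AS latest
  ON a.day = latest.day
 AND a.campaignid = latest.campaignid
 AND a._sdc_sequence = latest.seq
ORDER BY a.day, a.campaignid
"""


def latest_per_day_and_campaign(rows):
    """Load the rows into an in-memory table and return the newest version per key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ads (day TEXT, campaignid INTEGER, campaign TEXT, "
        "cost REAL, impressions INTEGER, _sdc_sequence INTEGER)"
    )
    conn.executemany("INSERT INTO ads VALUES (?, ?, ?, ?, ?, ?)", rows)
    out = conn.execute(QUERY).fetchall()
    conn.close()
    return out


result = latest_per_day_and_campaign(ROWS)
for row in result:
    print(row)
```

Grouping the inner query by `campaignid` alone, as in the question, keeps a single row per campaign across all days, which is why most days disappear from the output.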

Related

Extract only latest status lines

I have three statuses for an order in my table: Entered, AW Appr and Completed. Entered and Completed have one row each, while AW Appr repeats on a weekly basis until the order is completed. I need all three statuses, each only once, and AW Appr should be the latest one. I am using this query, but it only gives the maximum month and maximum week data. It picks the highest available week of the Completed data instead of the latest entry. Please help me enhance the query to fetch all three statuses, with only the latest AW Appr.
Source Table
http://sqlfiddle.com/#!18/67924/2
SELECT DISTINCT t1.Contract_Number
,t2.D_Monthx
,t2.D_Reporting_Week
,t2.D_Status
FROM Table1 t1
INNER JOIN (
SELECT DISTINCT max(D_Monthx) D_Monthx
,Max(D_Reporting_Week) D_Reporting_Week
,D_Status
,Contract_Number
FROM Table1
GROUP BY Contract_Number
,D_Status
) t2 ON t1.Contract_Number = t2.Contract_Number
AND t1.D_Monthx = t2.D_Monthx
WHERE t1.Contract_Number = '130100964/2'
Result Should be -

Using Row_Number()
Select Contract_Number, D_Monthx, D_Reporting_Week, D_Status from
(Select *,
Row_Number() over
(partition by Contract_Number,D_Status order by D_Monthx desc)
as ranking
from Table1)c
where ranking=1
SqlFiddle

We can try using ROW_NUMBER here for a straightforward solution:
SELECT Contract_Number, D_Monthx, D_Reporting_Week, D_Status
FROM
(
SELECT D_Monthx, D_Reporting_Week, D_Status, Contract_Number,
ROW_NUMBER() OVER (PARTITION BY Contract_Number, D_Status
ORDER BY D_Reporting_Week DESC) rn
FROM yourTable
) t
WHERE rn = 1;
Demo
Row number works well here, because the Entered and Completed statuses only ever appear once, meaning their row numbers will always be one. Similarly, the row number for the most recent AW Appr, which we want to select, will also be one.
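To make the row-number reasoning concrete, here is a small runnable sketch using Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer). The sample rows and the integer encoding of `D_Monthx` and `D_Reporting_Week` are hypothetical, and ordering by month and then week is an assumption that combines the two answers above.

```python
import sqlite3

# Hypothetical rows: (Contract_Number, D_Monthx, D_Reporting_Week, D_Status)
ROWS = [
    ("130100964/2", 1, 1, "Entered"),
    ("130100964/2", 1, 2, "AW Appr"),
    ("130100964/2", 1, 3, "AW Appr"),
    ("130100964/2", 2, 5, "AW Appr"),    # most recent AW Appr
    ("130100964/2", 2, 6, "Completed"),
]

# Number the rows inside each (contract, status) group, newest first,
# then keep only the first row of every group.
QUERY = """
SELECT Contract_Number, D_Monthx, D_Reporting_Week, D_Status
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY Contract_Number, D_Status
               ORDER BY D_Monthx DESC, D_Reporting_Week DESC
           ) AS rn
    FROM Table1 AS t
)
WHERE rn = 1
ORDER BY D_Reporting_Week
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Contract_Number TEXT, D_Monthx INTEGER, "
    "D_Reporting_Week INTEGER, D_Status TEXT)"
)
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```

Each status comes back exactly once, and the three AW Appr rows collapse to the one with the highest month and week.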

How I Can get the last version for each row SQL?

I am trying to use this query to get daily costs for each campaign, but I only get the cost for one campaign. Every campaign has the same "_sdc_sequence", but for each day there are many "_sdc_sequence" values. How can I get the cost from the last version for each day and campaign, and select only some variables such as day, cost, impressions and campaign? Right now I get every variable in my database.
"_sdc_sequence" is a Unix epoch attached to the record during replication; it determines the order of all the versions of a row.
I attached a picture of the table. I need to select only the last sequence (max _sdc_sequence).
#standardSQL
SELECT row.* FROM (
SELECT ARRAY_AGG(t ORDER BY _sdc_sequence DESC LIMIT 1)[OFFSET(0)] row
FROM `adxxxxx_xxxxxxxx` t
GROUP BY day
)
thanks

#standardSQL
SELECT row.* FROM (
SELECT ARRAY_AGG(t ORDER BY _sdc_sequence DESC LIMIT 1)[OFFSET(0)] row
FROM `adxxxxx_xxxxxxxx` t
GROUP BY day, campaign
)

SQL to Generate Periodic Snapshots from Transactions Table

I'm trying to create a periodic snapshot view from a database's transaction table after the fact. The transaction table has the following fields:
account_id (foreign key)
event_id
status_dt
status_cd
Every time an account changes status in the application, a new row is added to the transaction table with the new status. I'd like to produce a view that shows the count of accounts by status on every date; it should have the following fields:
snapshot_dt
status_cd
count_of_accounts
This will get the count for any given day, but not for all days:
SELECT status_cd, COUNT(account_id) AS count_of_accounts
FROM transactions
JOIN (
SELECT account_id, MAX(event_id) AS event_id
FROM transactions
WHERE status_dt <= DATE '2014-12-05'
GROUP BY account_id) latest
USING (account_id, event_id)
GROUP BY status_cd
Thank you!

Okay, this is going to be hard to explain.
On each date for each status, you should count up two values:
The number of customers who start with that status.
The number of customers who leave with that status.
The first value is easy. It is just the aggregation of the transactions by the date and the status.
The second value is almost as easy. You get the previous status code and count the number of times that that status code "leaves" on that date.
Then, the key is the cumulative sum of the first value minus the cumulative sum of the second value.
I freely admit that the following code is not tested (if you had a SQL Fiddle, I'd be happy to test it). But this is what the resulting query looks like:
select status_dt, status_cd,
(sum(inc_cnt) over (partition by status_cd order by status_dt) -
sum(dec_cnt) over (partition by status_cd order by status_dt)
) as dateamount
from ((select t.status_dt, t.status_cd, count(*) as inc_cnt, 0 as dec_cnt
from transactions t
group by t.status_dt, t.status_cd
) union all
(select t.status_dt, prev_status_cd, 0, count(*)
from (select t.*,
lag(t.status_cd) over (partition by t.account_id order by status_dt) as prev_status_cd
from transactions t
) t
-- only rows that have a previous status count as "leaving" that status
where prev_status_cd is not null
group by t.status_dt, prev_status_cd
)
) t;
If you have dates where there is no change for one or more statuses and you want to include those in the output, then the above query would need to use cross join to first create the rows in the result set. It is unclear if this is a requirement, so I'm leaving out that complication.
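The enter/leave bookkeeping described above can be tried on a tiny data set. The sketch below is one possible arrangement, not the answer's exact query: it nets the two counts per date and status before taking the running total, so each (date, status) pair yields a single row. The sample rows are hypothetical, and ordering LAG by `event_id` assumes that event ids increase over time (the question's use of MAX(event_id) suggests this). It runs on SQLite 3.25+ through Python's standard library.

```python
import sqlite3

# Hypothetical rows: (account_id, event_id, status_dt, status_cd)
ROWS = [
    (1, 1, "2014-12-01", "A"),
    (2, 2, "2014-12-01", "A"),
    (2, 3, "2014-12-02", "B"),
    (3, 4, "2014-12-02", "A"),
    (1, 5, "2014-12-03", "B"),
    (2, 6, "2014-12-03", "C"),
]

QUERY = """
WITH changes AS (
    -- +1 for every account entering a status on a date
    SELECT status_dt, status_cd, COUNT(*) AS delta
    FROM transactions
    GROUP BY status_dt, status_cd
    UNION ALL
    -- -1 for every account leaving its previous status on a date
    SELECT status_dt, prev_status_cd, -COUNT(*)
    FROM (
        SELECT status_dt,
               LAG(status_cd) OVER (
                   PARTITION BY account_id ORDER BY event_id
               ) AS prev_status_cd
        FROM transactions
    )
    WHERE prev_status_cd IS NOT NULL
    GROUP BY status_dt, prev_status_cd
),
daily AS (
    -- net change per date and status
    SELECT status_dt, status_cd, SUM(delta) AS net
    FROM changes
    GROUP BY status_dt, status_cd
)
SELECT status_dt AS snapshot_dt,
       status_cd,
       SUM(net) OVER (PARTITION BY status_cd ORDER BY status_dt) AS count_of_accounts
FROM daily
ORDER BY snapshot_dt, status_cd
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (account_id INTEGER, event_id INTEGER, "
    "status_dt TEXT, status_cd TEXT)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```

As noted above, a status only gets a row on dates where something entered or left it; filling in the quiet dates would still need a cross join against a calendar of dates.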

Select latest entry using date field (no repeats)

I have a really simple query which returns a list of item numbers, the date they were entered into the system, and the date when the entry was last modified:
SELECT DISTINCT asset_id, entered_date, modified_date
FROM price_data
The issue is that occasionally items are priced more than once, resulting in entries that have the same asset_id and entered_date, but different modified_dates. The above query works in that it returns the prices, but it returns both the original entry and the latest entry for anything that is repriced. How can I make this query return only the latest price value rather than both for any items that have been repriced?
Any help would be greatly appreciated! Many thanks.

You can group by the columns you want to be unique and then select the highest modified_date for each group:
SELECT asset_id, entered_date, max(modified_date)
FROM price_data
GROUP BY asset_id, entered_date

select p.*
from price_data p
join (select asset_id, max(modified_date) as last_modified_date
from price_data
group by asset_id) v
on v.asset_id = p.asset_id
and v.last_modified_date = p.modified_date
You can't group by price without impacting the results, so you have to select the latest modified date separately in an inline view and then join back to the actual table.

Fastest/most efficient way to perform this SQL Server 2008 query?

I have a table which contains:
-an ID for a financial instrument
-the price
-the date the price was recorded
-the actual time the price was recorded
-the source of the price
I want to get the index ID, the latest price, price source and the date of this latest price, for each instrument, where the source is either "L" or "R". I prefer source "L" to "R", but the latest price is more important (so if the latest price date only has a source of "R"- take this, but if for the latest date we have both, take "L").
This is the SQL I have:
SELECT tab1.IndexID, tab1.QuoteDate, tab2.Source, tab2.ActualTime FROM
(SELECT IndexID, Max(QuoteDate) as QuoteDate FROM PricesTable GROUP BY IndexID) tab1
JOIN
(SELECT IndexID, Min(Source) AS Source, Max(UpdatedTime) AS ActualTime, QuoteDate FROM PricesTable WHERE Source IN ('L','R') GROUP BY IndexID, QuoteDate) tab2
ON tab1.IndexID = tab2.IndexID AND tab1.QuoteDate = tab2.QuoteDate
However, I also want to extract the price field but cannot get this due to the GROUP BY clause. I cannot extract the price without including price in either the GROUP BY, or an aggregate function.
Instead, I have had to join the above SQL code to another piece of SQL which just gets the prices and index IDs and joins on the index ID.
Is there a faster way of performing this query?
EDIT: thanks for the replies so far. Would it be possible to have some advice on which are more efficient in terms of performance?
Thanks

Use ROW_NUMBER within a subquery or CTE to order the rows how you're interested in them, then just select the rows that come at the top of that ordering. (Use PARTITION BY so that row numbers are reassigned starting at 1 for each IndexID):
;WITH OrderedValues as (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY IndexID ORDER BY QuoteDate desc,Source asc) as rn
FROM
PricesTable
)
SELECT * from OrderedValues where rn=1

Try:
select * from
(select p.*,
row_number() over (partition by IndexID
order by QuoteDate desc, Source) rn
from PricesTable p
where Source IN ('L','R')
) sq
where rn = 1
(This syntax should work in relatively recent versions of Oracle, SQL Server or PostgreSQL, but won't work in MySQL before 8.0, which is the release that added window functions.)
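The two-key ordering (newest date first, then "L" before "R") can be checked on a few rows. This is only a sketch: the `Price` and `UpdatedTime` column types and all sample values are hypothetical, and SQLite 3.25+ stands in for SQL Server so the query can run from Python's standard library. Because the whole row survives the filter, the price comes along without a second join.

```python
import sqlite3

# Hypothetical rows: (IndexID, Price, QuoteDate, UpdatedTime, Source)
ROWS = [
    (1, 100.0, "2012-01-01", "09:00", "L"),
    (1, 101.0, "2012-01-02", "09:00", "R"),
    (1, 102.0, "2012-01-02", "10:00", "L"),  # latest date has both sources: prefer L
    (2, 50.0, "2012-01-01", "09:00", "L"),
    (2, 51.0, "2012-01-02", "09:00", "R"),   # latest L/R date only has R: take R
    (2, 52.0, "2012-01-03", "09:00", "X"),   # other sources are filtered out
]

# Newest QuoteDate wins; within that date 'L' sorts before 'R'.
QUERY = """
SELECT IndexID, Price, Source, QuoteDate
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (
               PARTITION BY IndexID
               ORDER BY QuoteDate DESC, Source ASC
           ) AS rn
    FROM PricesTable AS p
    WHERE Source IN ('L', 'R')
)
WHERE rn = 1
ORDER BY IndexID
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PricesTable (IndexID INTEGER, Price REAL, QuoteDate TEXT, "
    "UpdatedTime TEXT, Source TEXT)"
)
conn.executemany("INSERT INTO PricesTable VALUES (?, ?, ?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```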
