Taking Last Element of Every User ID In SQL Table

I am trying to figure out how to do the following in SQL (I'm specifically working in Teradata). Given the following table of user IDs, transaction date, and item bought, I'm trying to figure out how for each user ID to get the last item bought, for example:
User ID Date Product
123 12/01/1996 A
123 12/02/1996 B
123 12/03/1996 C
124 12/01/1996 B
124 12/04/1996 A
123 12/05/1996 D
So the query would return in this case:
User ID Last Product Bought
123 D
124 A
And so forth. I tried using a Partition By or Window function in Teradata, but could not figure out how to implement it.
Thanks for your help.

Apply Teradata's proprietary syntax for filtering Windowed Aggregates:
select *
from tab
qualify
   row_number()
   over (partition by User_ID         -- each user
         order by Date_col desc) = 1  -- latest row

In Teradata, you can use row_number() and qualify to solve this top-1-per-group problem:
select t.*
from mytable t
qualify row_number() over(partition by user_id order by date desc) = 1
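
QUALIFY is not available in every database, so here is a minimal sketch of the same top-1-per-group idea using a derived table instead. It runs against SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name purchases and its column names are hypothetical, and the dates are stored as ISO strings so that they sort chronologically.

```python
import sqlite3

# Hypothetical table and column names; sample rows taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (user_id INTEGER, purchase_date TEXT, product TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (123, "1996-12-01", "A"),
        (123, "1996-12-02", "B"),
        (123, "1996-12-03", "C"),
        (124, "1996-12-01", "B"),
        (124, "1996-12-04", "A"),
        (123, "1996-12-05", "D"),
    ],
)

# Number each user's rows from newest to oldest in a derived table,
# then keep only row 1 in the outer WHERE clause.
last_products = conn.execute(
    """
    SELECT user_id, product
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY purchase_date DESC) AS rn
        FROM purchases p
    ) ranked
    WHERE rn = 1
    ORDER BY user_id
    """
).fetchall()

print(last_products)  # [(123, 'D'), (124, 'A')]
```

The derived table does the job that QUALIFY does in Teradata: the row number is computed first and filtered afterwards.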

Related

Given Netezza does not support First and Last when aggregating, how to proceed?

I would like to group data on some column called CustID and select their first or the last mortgage even if the mortgages were originated on the same date. How do you do that in Netezza? In MS Access I normally use the First or Last aggregation functions for that.
Data comes like this:
CustID mortgageID pass_dt
101 090234W 1-23-1989
101 103120X 5-20-2020
101 103121V 5-20-2020
So here I want either the second or the third record, but not both, when the extra criterion is pass_dt = 5-20-2020.
Thanks very much!

If you want the entire record, use window functions:
select t.*
from (select t.*, row_number() over (partition by custid order by pass_dt desc) as seqnum
from t
) t
where seqnum = 1
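
When two mortgages share the same pass_dt, ordering by pass_dt alone leaves the choice between them up to the database. One way to make the pick repeatable is to add a second sort key. The sketch below uses mortgageid as the tie-breaker, which is an assumption, not something the question specifies. It runs on SQLite through Python's sqlite3 module, not Netezza, with the dates rewritten as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (custid INTEGER, mortgageid TEXT, pass_dt TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (101, "090234W", "1989-01-23"),
        (101, "103120X", "2020-05-20"),
        (101, "103121V", "2020-05-20"),
    ],
)

# pass_dt DESC picks the latest date; mortgageid DESC (an assumed tie-breaker)
# decides between rows that share that date.
latest = conn.execute(
    """
    SELECT custid, mortgageid, pass_dt
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY custid
                                  ORDER BY pass_dt DESC, mortgageid DESC) AS seqnum
        FROM t
    ) ranked
    WHERE seqnum = 1
    """
).fetchall()

print(latest)  # [(101, '103121V', '2020-05-20')]
```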

select CustID, max(mortgageID), max(pass_dt) from t1 group by 1;

Snowflake SQL code to show only second record for items with duplicate ID

I'm trying to get my head around SQL and am using Snowflake as a testbed to do this. I have a table with products which have multiple reviews against them. I am trying to structure a query to only show products with 2 or more reviews and then only show the second review. As I say, this is merely me trying to better understand SQL so selecting the second review is a random ask. The table is made up of 4 columns. 1 is Product ID, 2 is Product Name, 3 is Review and 4 is Date Review was posted.
Thanks in advance for any help.

You use row_number() for this type of query:
select t.*
from (select t.*,
row_number() over (partition by product_id order by date_review asc) as seqnum
from t
) t
where seqnum = 2;

You can use a windowing function like ROW_NUMBER() to make numbered groupings, eg:
WITH Review_Sequence AS (
SELECT r.*,
ROW_NUMBER() OVER (PARTITION BY Product_ID ORDER BY Review_Date) Review_No
FROM Reviews r
)
SELECT * FROM Review_Sequence WHERE Review_No = 2
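
Here is a small runnable sketch of the CTE approach, using SQLite through Python's sqlite3 module in place of Snowflake. The table name, column names and sample reviews are all made up for illustration. Note that products with a single review drop out on their own, because they never get a row numbered 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews "
    "(product_id INTEGER, product_name TEXT, review TEXT, review_date TEXT)"
)
# Hypothetical sample data: product 2 has only one review.
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?)",
    [
        (1, "Kettle", "Boils fast", "2021-01-05"),
        (1, "Kettle", "Stopped working", "2021-02-10"),
        (1, "Kettle", "Replaced under warranty", "2021-03-01"),
        (2, "Toaster", "Fine", "2021-01-20"),
        (3, "Blender", "Loud", "2021-04-02"),
        (3, "Blender", "Great smoothies", "2021-04-15"),
    ],
)

# Number the reviews of each product by date, then keep review number 2.
second_reviews = conn.execute(
    """
    WITH review_sequence AS (
        SELECT r.*,
               ROW_NUMBER() OVER (PARTITION BY product_id
                                  ORDER BY review_date) AS review_no
        FROM reviews r
    )
    SELECT product_id, product_name, review, review_date
    FROM review_sequence
    WHERE review_no = 2
    ORDER BY product_id
    """
).fetchall()

for row in second_reviews:
    print(row)
```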

Finding the first occurrence of an element in a SQL database

I have a table with a column for customer names, a column for purchase amount, and a column for the date of the purchase. Is there an easy way I can find how much first time customers spent on each day?
So I have
Name | Purchase Amount | Date
Joe 10 9/1/2014
Tom 27 9/1/2014
Dave 36 9/1/2014
Tom 7 9/2/2014
Diane 10 9/3/2014
Larry 12 9/3/2014
Dave 14 9/5/2014
Jerry 16 9/6/2014
And I would like something like
Date | Total first Time Purchase
9/1/2014 73
9/3/2014 22
9/6/2014 16
Can anyone help me out with this?

The following is standard SQL and works on nearly all DBMSs:
select date,
sum(purchaseamount) as total_first_time_purchase
from (
select date,
purchaseamount,
row_number() over (partition by name order by date) as rn
from the_table
) t
where rn = 1
group by date;
The derived table (the inner select) picks each customer's first purchase, and the outer query sums those amounts by date.
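
To check the query against the sample data from the question, here is a minimal sketch that runs it on SQLite through Python's sqlite3 module. The column names purchase_date and purchaseamount are assumptions, and the dates are stored as ISO strings so that ORDER BY sorts them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE the_table (name TEXT, purchaseamount INTEGER, purchase_date TEXT)"
)
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        ("Joe", 10, "2014-09-01"),
        ("Tom", 27, "2014-09-01"),
        ("Dave", 36, "2014-09-01"),
        ("Tom", 7, "2014-09-02"),
        ("Diane", 10, "2014-09-03"),
        ("Larry", 12, "2014-09-03"),
        ("Dave", 14, "2014-09-05"),
        ("Jerry", 16, "2014-09-06"),
    ],
)

# Inner query: rn = 1 marks each customer's earliest purchase.
# Outer query: sum those first purchases per day.
totals = conn.execute(
    """
    SELECT purchase_date,
           SUM(purchaseamount) AS total_first_time_purchase
    FROM (
        SELECT purchase_date,
               purchaseamount,
               ROW_NUMBER() OVER (PARTITION BY name
                                  ORDER BY purchase_date) AS rn
        FROM the_table
    ) t
    WHERE rn = 1
    GROUP BY purchase_date
    ORDER BY purchase_date
    """
).fetchall()

print(totals)  # [('2014-09-01', 73), ('2014-09-03', 22), ('2014-09-06', 16)]
```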

The two key concepts here are aggregates and sub-queries, and the details of which dbms you're using may change the exact implementation, but the basic concept is the same.
1. For each name, determine their first date
2. Using the results of 1, find each person's first day purchase amount
3. Using the results of 2, sum the amounts for each date
In SQL Server, it could look like this:
select Date, [totalFirstTimePurchases] = sum(PurchaseAmount)
from (
select t.Date, t.PurchaseAmount, t.Name
from table1 t
join (
select Name, [firstDate] = min(Date)
from table1
group by Name
) f on t.Name=f.Name and t.Date=f.firstDate
) ftp
group by Date

If you are using SQL Server you can accomplish this with either sub-queries or CTEs (Common Table Expressions). Since there is already an answer with sub-queries, here is the CTE version.
First the following will identify each row where there is a first time purchase and then get the sum of those values grouped by date:
;WITH cte
AS (
SELECT [Name]
,PurchaseAmount
,[date]
,ROW_NUMBER() OVER (
PARTITION BY [Name] ORDER BY [date] --start at 1 for each name at the earliest date and count up, reset every time the name changes
) AS rn
FROM yourTableName
)
SELECT [date]
,sum(PurchaseAmount) AS TotalFirstTimePurchases
FROM cte
WHERE rn = 1
GROUP BY [date]

SQL Query to obtain the maximum value for each unique value in another column

ID Sum Name
a 10 Joe
a 8 Mary
b 21 Kate
b 110 Casey
b 67 Pierce
What would you recommend as the best way to
obtain for each ID the name that corresponds to the largest sum (grouping by ID).
What I tried so far:
select ID, SUM(Sum) s, Name
from Table1
group by ID, Name
Order by SUM(Sum) DESC;
this will arrange the records into groups that have the highest sum first. Then I have to somehow flag those records and keep only those. Any tips or pointers? Thanks a lot
In the end I'd like to obtain:
a 10 Joe
b 110 Casey

You want the row_number() function:
select id, [sum], name
from (select t.*,
             row_number() over (partition by id order by [sum] desc) as seqnum
      from table1 t
     ) t
where seqnum = 1;
Your question is more confusing than it needs to be because you have a column called sum. You should avoid using SQL reserved words for identifiers.
The row_number() function assigns sequential numbers to the rows within each group, starting with 1. The group is defined by the partition by clause. In this case, all rows with the same id are in the same group. The ordering of the numbers is determined by the order by clause, so the row with the largest value of sum gets the value of 1.
If you might have duplicate maximum values and you want all of them, use the related function rank() or dense_rank().
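
As a rough illustration of the difference, the sketch below swaps row_number() for rank() so that tied maximums are all returned. It runs on SQLite through Python's sqlite3 module. The column is named total instead of sum to avoid the reserved word, and the row for Ann is a hypothetical addition that creates a tie in group a.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id TEXT, total INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("a", 10, "Joe"),
        ("a", 8, "Mary"),
        ("a", 10, "Ann"),  # hypothetical row, ties with Joe
        ("b", 21, "Kate"),
        ("b", 110, "Casey"),
        ("b", 67, "Pierce"),
    ],
)

# RANK() gives tied rows the same number, so every row sharing the
# maximum total in its group gets rank 1.
top_rows = conn.execute(
    """
    SELECT id, total, name
    FROM (
        SELECT t.*,
               RANK() OVER (PARTITION BY id ORDER BY total DESC) AS rnk
        FROM table1 t
    ) ranked
    WHERE rnk = 1
    ORDER BY id, name
    """
).fetchall()

print(top_rows)  # [('a', 10, 'Ann'), ('a', 10, 'Joe'), ('b', 110, 'Casey')]
```

With row_number() in place of rank(), only one of Ann and Joe would be kept, and which one is not defined unless the order by clause breaks the tie.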

select *
from
(
    select *
        ,rn = row_number() over (partition by Id order by [sum] desc)
    from table1
) x
where x.rn = 1

How can I SELECT additional columns with a TSQL query using GROUP BY

I have a view (that is a union of several tables) and I need to filter out duplicates. The table looks like this:
id first last logo email entered
1 joe smith i.jpg e@m.c 2014-01-27
2 jim smith b.jpg e@j.c 2014-01-27
3 bob smith z.jpg b@b.c 2014-01-27
9 joeseph smith q.gif e@m.c 2014-01-20
I want to do something like this, but I can't seem to get a valid syntax for it:
SELECT
email, MAX(entered), first, last -- such that first and last come from the same row as the MAX(entered)
FROM
my_view
GROUP BY
email

Since your names are not the same on the duplicate email rows, you must use the row_number() function instead:
select email, entered, first, last
from (
select *, row_number() over (partition by email order by entered desc) rn
from my_view
) x
where rn = 1
You need a subquery because row_number() is not allowed in the where clause.
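
The restriction is easy to see in practice. The sketch below uses SQLite through Python's sqlite3 module, not SQL Server, so the exact error will differ, but the pattern is the same: the window function is rejected in WHERE and accepted once it is moved into a derived table. Here my_view is created as a plain table, and the column names first_name and last_name are assumed stand-ins for first and last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A plain table standing in for the view from the question.
conn.execute(
    "CREATE TABLE my_view "
    "(id INTEGER, first_name TEXT, last_name TEXT, logo TEXT, email TEXT, entered TEXT)"
)
conn.executemany(
    "INSERT INTO my_view VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "joe", "smith", "i.jpg", "e@m.c", "2014-01-27"),
        (2, "jim", "smith", "b.jpg", "e@j.c", "2014-01-27"),
        (3, "bob", "smith", "z.jpg", "b@b.c", "2014-01-27"),
        (9, "joeseph", "smith", "q.gif", "e@m.c", "2014-01-20"),
    ],
)

# Attempt 1: window function directly in WHERE. SQLite refuses to prepare it.
where_failed = False
try:
    conn.execute(
        "SELECT email FROM my_view "
        "WHERE ROW_NUMBER() OVER (PARTITION BY email ORDER BY entered DESC) = 1"
    )
except sqlite3.Error as exc:
    where_failed = True
    print("rejected:", exc)

# Attempt 2: compute the row number in a derived table, filter outside it.
deduped = conn.execute(
    """
    SELECT email, entered, first_name, last_name
    FROM (
        SELECT v.*,
               ROW_NUMBER() OVER (PARTITION BY email
                                  ORDER BY entered DESC) AS rn
        FROM my_view v
    ) x
    WHERE rn = 1
    ORDER BY email
    """
).fetchall()

for row in deduped:
    print(row)
```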

You want to use row_number():
SELECT email, entered, first, last
FROM (select v.*, row_number() over (partition by email order by entered desc) as seqnum
from my_view v
) v
WHERE seqnum = 1;
row_number() is a window function that assigns sequential numbers to groups of rows. The groups are defined by the partition by clause. In this case, everything with the same email is in the same group. The first row is given a value 1; the ordering is based on the order by clause.
The outer query selects the first one, which has the largest entered date.
