How to select the last record of each ID - sql

I need to extract the last records of each user from the table. The table schema is like below.
mytable
product | user_id |
-------------------
A | 15 |
B | 15 |
A | 16 |
C | 16 |
-------------------
The output I want to get is
product | user_id |
-------------------
B | 15 |
C | 16 |
Basically the last records of each user.
Thanks in advance!

You can use a window function called ROW_NUMBER. Here is a solution:
WITH CTE AS
(SELECT product, user_id,
ROW_NUMBER() OVER(PARTITION BY user_id order by product desc)
as RN
FROM Mytable)
SELECT product, user_id FROM CTE WHERE RN=1 ;

You can try using row_number()
select product, userid
from
(
select product, userid,row_number() over(partition by userid order by product desc) as rn
from tablename
)A where rn=1

There is no such thing as a "last" record unless you have a column that specifies the ordering. SQL tables represent unordered sets (well technically, multisets).
If you have such a column, then use distinct on:
select distinct on (user_id) t.*
from t
order by user_id, <ordering col> desc;
Distinct on is a very handy Postgres extension that returns one row per "group". It is the first row based on the ordering specified in the order by clause.
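As a minimal sketch of the distinct on approach, assuming PostgreSQL and a hypothetical created_at column (the question's table has no ordering column, so this column and its values are invented for illustration):

```sql
-- Hypothetical table: created_at is an assumed ordering column,
-- it is not part of the question's schema.
CREATE TABLE t (product text, user_id int, created_at timestamp);

INSERT INTO t VALUES
  ('A', 15, '2020-01-01 10:00'),
  ('B', 15, '2020-01-02 10:00'),
  ('A', 16, '2020-01-01 11:00'),
  ('C', 16, '2020-01-03 09:00');

-- One row per user_id: the first row of each group after sorting
-- by created_at descending, i.e. the latest one.
SELECT DISTINCT ON (user_id) product, user_id
FROM t
ORDER BY user_id, created_at DESC;
```

With this sample data the query should return B for user 15 and C for user 16.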

You should have a column that stores the insertion order, whether through auto increment or a value with date and time.
Ex:
autoIncrement | product | user_id |
-----------------------------------
1 | A | 15 |
2 | B | 15 |
3 | A | 16 |
4 | C | 16 |
-----------------------------------
SELECT product, user_id FROM mytable INNER JOIN
( SELECT MAX(autoIncrement) AS id FROM mytable GROUP BY user_id ) AS table_Aux
ON mytable.autoIncrement = table_Aux.id

Related

Is there a way to calculate average based on distinct rows without using a subquery?

If I have data like so:
+----+-------+
| id | value |
+----+-------+
| 1 | 10 |
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
| 2 | 20 |
+----+-------+
How do I calculate the average based on the distinct id WITHOUT using a subquery (i.e. querying the table directly)?
For the above example it would be (10+20+30)/3 = 20
I tried to do the following:
SELECT AVG(IF(id = LAG(id) OVER (ORDER BY id), NULL, value)) AS avg
FROM table
Basically I was thinking that if I order by id and check the previous row to see if it has the same id, the value should be NULL and thus it would not be counted into the calculation, but unfortunately I can't put analytical functions inside aggregate functions.
As far as I know, you can't do this without a subquery. I would use:
SELECT AVG(avg_value)
FROM
(
SELECT AVG(value) AS avg_value
FROM yourTable
GROUP BY id
) t;
-- QUALIFY is a vendor extension (e.g. Teradata, Snowflake, BigQuery)
WITH ranked AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS rn
FROM
yourTable
QUALIFY rn = 1
)
SELECT
AVG(value)
FROM ranked
The outer query will have other parameters that need to access all the data in the table
I interpret this comment as wanting an average on every row -- rather than doing an aggregation. If so, you can use window functions:
select t.*,
avg(case when seqnum = 1 then value end) over () as overall_avg
from (select t.*,
row_number() over (partition by id order by id) as seqnum
from t
) t;
Yes, there is a way, as long as no two ids share the same value: use distinct inside the avg function. Note that this deduplicates by value rather than by id.
select avg(distinct value) from tab;
http://sqlfiddle.com/#!4/9d156/2/0
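A small sketch of where avg(distinct value) and an average per distinct id diverge, assuming PostgreSQL and hypothetical data in which ids 1 and 2 happen to share the value 10 (this data is not from the question):

```sql
-- Hypothetical data: ids 1 and 2 both carry the value 10.
WITH tab (id, value) AS (
  VALUES (1, 10), (1, 10), (2, 10), (3, 40)
)
SELECT
  -- Deduplicates by value: (10 + 40) / 2 = 25
  AVG(DISTINCT value) AS avg_distinct_value,
  -- Deduplicates by id: (10 + 10 + 40) / 3 = 20
  (SELECT AVG(v)
   FROM (SELECT MAX(value) AS v FROM tab GROUP BY id) per_id) AS avg_per_id
FROM tab;
```

The two columns agree only when every id has its own value, as in the question's sample data.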

SQL Query find users with only one product type

I solemnly swear I did my best to find an existing question; maybe I'm just not sure how to phrase it correctly.
I would like to return records for users that have quota for only one product type.
| user_id | product |
| 1 | A |
| 1 | B |
| 1 | C |
| 2 | B |
| 3 | B |
| 3 | C |
| 3 | D |
In the example above I'd like a query that only returns users who carry quota for only one product type - doesn't really matter which product at this point.
I tried using select user_id, product from table group by 1,2 having count(user) < 2 but this does not work, nor does select user_id, product from table group by 1,2 having count(*) < 2
Any help is appreciated.
Your having clause is good; the issue's with your group by. Try this:
select user_id
, count(distinct product) NumberOfProducts
from table
group by user_id
having count(distinct product) = 1
Or you could do this; which is closer to your original:
select user_id
from table
group by user_id
having count(*) < 2
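Depending on the database, the group by clause may not accept ordinal arguments the way the order by clause does (PostgreSQL and MySQL accept them; others may not). Where ordinals are not supported, grouping by a value like 1 means grouping by the literal value 1, which is the same for every row, so all the rows in the table fall into one group. Since there is more than one product in the entire table, no rows will be returned. Where ordinals are supported, group by 1,2 groups by both user_id and product, so each (user_id, product) pair becomes its own group and the count no longer reflects the number of products per user.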
Instead, you should group by the user_id:
SELECT user_id
FROM mytable
GROUP BY user_id
HAVING COUNT(*) = 1
If you want the product, then do:
select user_id, max(product) as product
from table
group by user_id
having min(product) = max(product);
The having clause could also be:
having count(distinct product) = 1
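To check the grouping against the question's sample rows, here is a minimal sketch assuming PostgreSQL; the name quota for the inline table is hypothetical:

```sql
-- The sample rows from the question, inlined as a CTE.
WITH quota (user_id, product) AS (
  VALUES (1, 'A'), (1, 'B'), (1, 'C'),
         (2, 'B'),
         (3, 'B'), (3, 'C'), (3, 'D')
)
SELECT user_id, MAX(product) AS product
FROM quota
GROUP BY user_id                      -- one group per user
HAVING COUNT(DISTINCT product) = 1;   -- keep users with a single product type
```

Only user 2 (product B) should come back, since users 1 and 3 each have three product types.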

How can i select only id of min created date in each group [duplicate]

Imagine the following tables:
Ticket Table
========================
| id | question |
========================
| 1 | Can u help me :)? |
========================
UserEntry Table
======================================================
| id | answer | dateCreated | ticket_id |
======================================================
| 2 | It's my plessure :)? | 2016-08-05 | 1 |
=======================================================
| 3 | How can i help u ? | 2016-08-06 | 1 |
======================================================
How can I get only the id of the row with the minimum date value in each group?
My expected result would look like this:
====
| id |
====
| 2 |
====
UPDATE:
I found a solution with the following query:
SELECT id FROM UserEntry WHERE datecreated IN (SELECT MIN(datecreated) FROM UserEntry GROUP BY ticket_id)
Improved answer (it matches on ticket_id as well, so a minimum date from one ticket cannot match a row of another ticket):
SELECT id FROM UserEntry WHERE (ticket_id, datecreated) IN
(SELECT ticket_id, MIN(datecreated) FROM UserEntry GROUP BY ticket_id);
This is also a correct answer (NOTE: DISTINCT ON is not part of the SQL standard):
SELECT DISTINCT ON (ue.ticket_id) ue.id
FROM UserEntry ue
ORDER BY ue.ticket_id, ue.datecreated
It seems you want to select the ID with the minimum datecreated. That is simple: select the minimum date and then select the id(s) matching this date.
SELECT id FROM UserEntry WHERE datecreated = (SELECT MIN(datecreated) FROM UserEntry);
If you are sure you won't have ties, or if you are fine with just one row anyway, you can also use FETCH FIRST ROW ONLY. Unfortunately, PostgreSQL versions before 13 do not support the WITH TIES clause for it.
SELECT id FROM UserEntry ORDER BY datecreated FETCH FIRST ROW ONLY;
UPDATE: You want the entry ID for the minimum date per ticket. Per ticket translates to GROUP BY ticket_id in SQL.
SELECT ticket_id, id FROM UserEntry WHERE (ticket_id, datecreated) IN
(SELECT ticket_id, MIN(datecreated) FROM UserEntry GROUP BY ticket_id);
The same can be achieved with window functions where you read the table only once:
SELECT ticket_id, id
FROM
(
SELECT ticket_id, id, RANK() OVER (PARTITION BY ticket_id ORDER BY datecreated) AS rnk
FROM UserEntry
) ranked
WHERE rnk = 1;
(Change SELECT ticket_id, id to SELECT id if you want the queries not to show the ticket ID, which would make the results harder to understand of course :-)
You may want fetch first row only, or distinct on if you care about more than one ticket:
SELECT DISTINCT ON (ue.ticket_id) ue.id
FROM UserEntry ue
ORDER BY ue.ticket_id, ue.datecreated
This will get the id of the row with the minimum datecreated value for each ticket.
A solution with ANSI SQL that works on a wide range of DBMS that support modern SQL is to use window functions:
select id
from (
select id, row_number() over (partition by ticket_id order by datecreated) as rn
from userentry
) t
where rn = 1;
Note that in Postgres, Gordon's solution using distinct on () is usually faster than using window functions.
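The answers above differ in how they treat ties on the minimum date. A minimal sketch of that difference, assuming PostgreSQL and hypothetical rows in which ids 2 and 3 share the earliest date (the question's data has no such tie):

```sql
-- Hypothetical rows: ids 2 and 3 tie on the earliest date of ticket 1.
WITH userentry (id, datecreated, ticket_id) AS (
  VALUES (2, DATE '2016-08-05', 1),
         (3, DATE '2016-08-05', 1),
         (4, DATE '2016-08-06', 1)
)
SELECT id,
       -- RANK gives tied rows the same rank: ids 2 and 3 both get 1, id 4 gets 3.
       RANK() OVER (PARTITION BY ticket_id ORDER BY datecreated) AS rnk,
       -- ROW_NUMBER never repeats; adding id to the ORDER BY makes the
       -- choice among tied rows deterministic: id 2 gets 1, id 3 gets 2.
       ROW_NUMBER() OVER (PARTITION BY ticket_id ORDER BY datecreated, id) AS rn
FROM userentry;
```

Filtering on rnk = 1 would keep both tied rows, while filtering on rn = 1 would keep exactly one row per ticket.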

Selecting compared pairs from table

I don't really know how to describe it. I have a table:
ID | Name | Date
-------------------------
1 | Mike | 01.01.2016
1 | Michael | 02.03.2016
2 | Samuel | 23.12.2015
2 | Sam | 05.03.2015
3 | Tony | 02.04.2012
I want to select pairs of IDs and Names with latest dates in each pair. The result here should be:
ID | Name | Date
-------------------------
1 | Michael | 02.03.2016
2 | Samuel | 23.12.2015
3 | Tony | 02.04.2012
How do I achieve this?
Oracle Database 11g
You can do it using the ROW_NUMBER() analytic function:
SELECT id, name, "date"
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY id ORDER BY "date" DESC ) rn
FROM table_name t
)
WHERE rn = 1
This requires only a single table scan (it does not have a self-join or correlated sub-query - i.e. IN (...) or EXISTS(...)).
Have a sub-select that returns each id and its max date:
select * from table
where (id, date) in (select id, max(date) from table group by id)
You can use NOT EXISTS() :
SELECT * FROM YourTable t
WHERE NOT EXISTS(SELECT 1 FROM YourTable s
WHERE t.id = s.id and s.date > t.date)
Possibly the most efficient method is:
select t.*
from table t
where t.date = (select max(date) from table t2 where t2.id = t.id);
along with an index on table(id, date).
This version should scan the table and look up the correct value in the index.
Or, if there are only three columns, you can use keep:
select id, max(date) as date,
max(name) keep (dense_rank first order by date desc) as name
from table
group by id;
I have found that this version works very well in Oracle.

Compare different orders of the same table

I have this following scenario, a table with these columns:
table_id|user_id|os_number|inclusion_date
In the system, the os_number is sequential for the users, but due to a system bug some users inserted OSs in wrong order. Something like this:
table_id | user_id | os_number | inclusion_date
-----------------------------------------------
1 | 1 | 1 | 2015-11-01
2 | 1 | 2 | 2015-11-02
3 | 1 | 3 | 2015-11-01
Note the os number 3 inserted before the os number 2
What I need:
Recover the table_id of rows 2 and 3, which are out of order.
I have these two selects that show me the table_id in two different orders:
select table_id from table order by user_id, os_number
select table_id from table order by user_id, inclusion_date
I can't figure out how to compare these two selects and see which users are affected by this system bug.
Your question is a bit difficult because there is no correct ordering (as presented) -- because dates can have ties. So, use the rank() or dense_rank() function to compare the two values and return the ones that are not in the correct order:
select t.*
from (select t.*,
rank() over (partition by user_id order by inclusion_date) as seqnum_d,
rank() over (partition by user_id order by os_number) as seqnum_o
from t
) t
where seqnum_d <> seqnum_o;
Use row_number() over both orders, numbering within each user:
select *
from (
select *,
row_number() over (partition by user_id order by os_number) rnn,
row_number() over (partition by user_id order by inclusion_date) rnd
from a_table
) s
where rnn <> rnd;
table_id | user_id | os_number | inclusion_date | rnn | rnd
----------+---------+-----------+----------------+-----+-----
3 | 1 | 3 | 2015-11-01 | 3 | 2
2 | 1 | 2 | 2015-11-02 | 2 | 3
(2 rows)
Not entirely sure about the performance on this but you could use a cross apply on the same table to get the results in one query. This will bring up the pairs of table_ids which are incorrect.
select
a.table_id as InsertedAfterTableId,
c.table_id as InsertedBeforeTableId
from table a
cross apply
(
select b.table_id
from table b
where b.user_id = a.user_id and b.inclusion_date < a.inclusion_date and b.os_number > a.os_number
) c
Both query examples given below simply check a mismatch between inclusion date and os_number:
This first query should return the offending row (the one whose os_number is off from its inclusion date)--in the case of the example row 3.
select table.table_id, table.user_id, table.os_number from table
where EXISTS(select * from table t
where t.user_id = table.user_id and
t.inclusion_date > table.inclusion_date and
t.os_number < table.os_number);
This second query will return the table numbers and users for two rows that are mismatched:
select first_table.table_id, second_table.table_id, first_table.user_id from
table first_table
JOIN table second_table
ON (first_table.user_id = second_table.user_id and
first_table.inclusion_date > second_table.inclusion_date and
first_table.os_number < second_table.os_number);
I would use WINDOW FUNCTIONS to get row numbers in orders in question and then compare them:
SELECT
sub.table_id,
sub.user_id,
sub.os_number,
sub.inclusion_date,
number_order_1, number_order_2
FROM (
SELECT
table_id,
user_id,
os_number,
inclusion_date,
row_number() OVER (PARTITION BY user_id
ORDER BY os_number
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING
) AS number_order_1,
row_number() OVER (PARTITION BY user_id
ORDER BY inclusion_date
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING
) AS number_order_2
FROM
table
) sub
WHERE
number_order_1 <> number_order_2
;
EDIT:
Because a_horse_with_no_name made a good point about my final answer, I've gone back to my first answer (see the edit history), which also works if os_number isn't gapless.
select *
from (
select a_table.*,
lag(inclusion_date) over (partition by user_id order by os_number) as last_date
from a_table
) result
where last_date is not null AND last_date>inclusion_date;
This should cover gaps as well as ties. Basically, I check the inclusion_date of the previous os_number and make sure it's not strictly greater than the current date (so two entries on the same date are fine).
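To try the lag() check on the question's three sample rows, here is a self-contained sketch assuming PostgreSQL, with the rows inlined as a CTE instead of a real table:

```sql
-- The three sample rows from the question.
WITH a_table (table_id, user_id, os_number, inclusion_date) AS (
  VALUES (1, 1, 1, DATE '2015-11-01'),
         (2, 1, 2, DATE '2015-11-02'),
         (3, 1, 3, DATE '2015-11-01')
)
SELECT *
FROM (
  SELECT a_table.*,
         -- inclusion_date of the previous os_number for the same user
         LAG(inclusion_date) OVER (PARTITION BY user_id ORDER BY os_number) AS last_date
  FROM a_table
) result
-- A NULL last_date (first row per user) never satisfies the comparison.
WHERE last_date > inclusion_date;
```

On this data only table_id 3 should be returned: its predecessor (os_number 2) was included on 2015-11-02, after its own 2015-11-01. Row 2 is not flagged by this check, so it reports the later row of each out-of-order pair rather than both rows.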