retrieve a random row for a group - sql

I have a table with the following columns:
1. ID
2. UserID
3. ImageUrl
I want to retrieve a random ImageUrl for each UserID. For example, there are 4 rows in the table:
1 12251 Winter.jpg
2 12251 Summer.jpg
3 33333 Fall.jpg
4 33333 Spring.jpg
and the query might return the following rows:
1 12251 Winter.jpg
4 33333 Spring.jpg

-- number each user's rows in random order, then keep the first one
select ID, UserID, ImageUrl from
(
select ID, UserID, ImageUrl, ROW_NUMBER() over (partition by UserID order by newid()) rn
from yourtable
) v
where rn = 1
order by UserID

Related

How to get rank of a user from all users

I have a table called summary_coins. I am trying to get a user's rank based on their total coins.
I tried the query below:
SELECT
user_id,
sum(get_count),
rank() over (order by sum(get_count) asc) as rank
FROM summary_coins
WHERE user_id = 2
GROUP BY user_id
Sample data: without user_id = 2 in the WHERE clause I get the list below:
user_id sum rank
44 2 1
13 4 2
57 4 2
47 4 2
11 5 5
2 5 5
My desired output:
2 5 5
With my query I always get rank 1 for user ID 2, but according to the list of all users it should be rank 5.
You want to apply WHERE user_id = 2 late. The WHERE clause is evaluated before GROUP BY and before the RANK() window function, so your query ranks only the single row for user 2, which is always rank 1. Rank all users first and filter afterwards by making your query a subquery you select from:
SELECT user_id, sum_count, rank
FROM
(
SELECT
user_id,
sum(get_count) AS sum_count,
rank() over (order by sum(get_count) asc) as rank
FROM summary_coins
GROUP BY user_id
) all_users
WHERE user_id = 2;
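As a quick check, here is a minimal sketch that runs the same pattern against the sums from the question. It assumes a database that accepts a VALUES list inside a CTE (PostgreSQL and SQLite do), and it assumes one row per user holding that user's total. The alias user_rank is illustrative, chosen because rank is a reserved word in some databases.

```sql
-- Hypothetical inline copy of summary_coins: one row per user (an assumption)
WITH summary_coins (user_id, get_count) AS (
    VALUES (44, 2), (13, 4), (57, 4), (47, 4), (11, 5), (2, 5)
),
ranked AS (
    -- rank every user first; no filter on user_id yet
    SELECT user_id,
           sum(get_count) AS sum_count,
           rank() OVER (ORDER BY sum(get_count) ASC) AS user_rank
    FROM summary_coins
    GROUP BY user_id
)
-- filter only after the ranks have been computed
SELECT user_id, sum_count, user_rank
FROM ranked
WHERE user_id = 2;
```

Under those assumptions this returns the single row 2, 5, 5, matching the desired output.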

How to "filter" records in Hive table?

Imagine a table with id, status and modified_date. One id can have more than one record in the table. For each id I need only one row: the current status together with the modified_date on which the status changed from the older value to the current one.
id  status  modified_date
--------------------------------------------
1   T       1-Jan
1   T       2-Jan
1   F       3-Jan
1   F       4-Jan
1   T       5-Jan
1   T       6-Jan
2   F       18-Feb
2   F       20-Feb
2   T       21-Feb
3   F       1-Mar
3   F       1-Mar
3   F       2-Mar
With everything I have tried so far, I cannot capture the second change for id 1, from F to T on 5-Jan.
So I expect these results:
id  status  modified_date
--------------------------------------------
1   T       5-Jan
2   T       21-Feb
3   F       1-Mar
Using the lag() analytic function you can look at the previous row to calculate a status_changed flag. Then use row_number() to mark the last status-changed row of each id with 1 and filter on it. See the comments in the code:
with your_data as (--replace with your table
select stack(12,
1,'T','1-Jan',
1,'T','2-Jan',
1,'F','3-Jan',
1,'F','4-Jan',
1,'T','5-Jan',
1,'T','6-Jan',
2,'F','18-Feb',
2,'F','20-Feb',
2,'T','21-Feb',
3,'F','1-Mar',
3,'F','1-Mar',
3,'F','2-Mar') as (id,status,modified_date)
)
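select id,status,modified_date
from
(
select id,status,modified_date,status_changed_flag,
--number the rows of each (id, flag) group from the latest modified_date backwards
row_number() over(partition by id, status_changed_flag order by modified_date desc) rn
from
(
select t.*,
--lag(status) over(partition by id order by modified_date) prev_status,
--true when the status differs from the previous row; NVL makes the first row of each id count as a change
--note: modified_date is a string in this sample, so it is ordered as text; use a real date column on actual data
NVL((lag(status) over(partition by id order by modified_date)!=status), true) status_changed_flag
from your_data t
)s
)s where status_changed_flag and rn=1 --keep only the latest change per id
order by id --remove ordering if not necessary
;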
Result:
OK
id status modified_date
1 T 5-Jan
2 T 21-Feb
3 F 1-Mar
Time taken: 178.643 seconds, Fetched: 3 row(s)

sql - select single ID for each group with the lowest value

Consider the following table:
ID GroupId Rank
1 1 1
2 1 2
3 1 1
4 2 10
5 2 1
6 3 1
7 4 5
I need a SQL (for MS-SQL) select query that returns a single ID for each group: the one with the lowest rank. Each group must return only one ID, even if two rows share the same lowest rank (as IDs 1 and 3 do in the above table). I've tried selecting the min value, but the requirement that only one row be returned, and that the returned value be the ID column, is throwing me.
Does anyone know how to do this?
Use row_number():
select t.*
from (select t.*,
row_number() over (partition by groupid order by rank) as seqnum
from t
) t
where seqnum = 1;
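Note that ordering by rank alone leaves the choice between tied rows up to the server. If the pick should be repeatable, a tie-breaker can be added to the ORDER BY. A minimal sketch, assuming the lowest ID should win a tie and using the question's sample rows inline (the aliases s and ranked are illustrative, and the VALUES constructor assumes SQL Server 2008 or later):

```sql
SELECT ID, GroupId, [Rank]
FROM (
    SELECT s.*,
           -- lowest Rank first; among equal ranks the lowest ID wins
           ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY [Rank], ID) AS seqnum
    FROM (VALUES (1, 1, 1), (2, 1, 2), (3, 1, 1), (4, 2, 10),
                 (5, 2, 1), (6, 3, 1), (7, 4, 5)) AS s (ID, GroupId, [Rank])
) AS ranked
WHERE seqnum = 1
ORDER BY GroupId;
```

With the sample data this returns IDs 1, 5, 6 and 7, one per group.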

SQL MAX(column) With Additional Criteria

I have a single table, where I want to return a list of the MAX(id) GROUPed by another identifier. However, I have a third column that, when it meets a certain criterion, "trumps" rows that don't meet that criterion.
Probably easier to explain with an example. Sample table has:
UniqueId (int)
GroupId (int)
IsPriority (bit)
Raw data:
UniqueId GroupId IsPriority
-----------------------------------
1 1 F
2 1 F
3 1 F
4 1 F
5 1 F
6 2 T
7 2 T
8 2 F
9 2 F
10 2 F
So, because no row in groupId 1 has IsPriority set, we return the highest UniqueId (5). Since groupId 2 has rows with IsPriority set, we return the highest UniqueId with that value (7).
So output would be:
5
7
I can think of ways to brute force this, but I am looking to see if I can do this in a single query.
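Rank the rows within each group with ROW_NUMBER(), sorting by IsPriority first so that priority rows come out ahead, then by UniqueId: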
WITH T
AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY GroupId
ORDER BY IsPriority DESC, UniqueId DESC ) AS RN
FROM YourTable)
SELECT UniqueId,
GroupId,
IsPriority
FROM T
WHERE RN = 1
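To check the ordering against the question's data, here is a self-contained sketch of the same query. It assumes SQL Server 2008 or later and builds a stand-in for YourTable inline, writing T/F as 1/0 and casting to bit; the alias v is illustrative.

```sql
WITH YourTable (UniqueId, GroupId, IsPriority) AS (
    -- inline copy of the question's raw data (T = 1, F = 0)
    SELECT UniqueId, GroupId, CAST(IsPriority AS bit)
    FROM (VALUES (1, 1, 0), (2, 1, 0), (3, 1, 0), (4, 1, 0), (5, 1, 0),
                 (6, 2, 1), (7, 2, 1), (8, 2, 0), (9, 2, 0), (10, 2, 0)
         ) AS v (UniqueId, GroupId, IsPriority)
),
T AS (
    -- priority rows sort first; within the same priority the highest UniqueId wins
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY GroupId
                              ORDER BY IsPriority DESC, UniqueId DESC) AS RN
    FROM YourTable
)
SELECT UniqueId
FROM T
WHERE RN = 1
ORDER BY GroupId;
```

For this data the query returns 5 for group 1 (no priority rows, so the highest UniqueId) and 7 for group 2 (the highest UniqueId among the priority rows).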

Applying a sort order to existing data using SQL 2008R2

I have some existing data that I need to apply a "SortOrder" to based upon a few factors:
The ordering starts at "1" for any given Owner
The ordering is applied alphabetically (basically following an ORDER BY Name) to increase the sort order.
Should two items have the same name (as I've illustrated in my data set), we can apply the lower sort order value to the item with the lower id.
Here is some sample data to help illustrate what I'm talking about:
What I have:
Id OwnerId Name SortOrder
------ ------- ---------------------- ---------
1 1 A Name NULL
2 1 C Name NULL
3 1 B Name NULL
4 2 Z Name NULL
5 2 Z Name NULL
6 2 A Name NULL
What I need:
Id OwnerId Name SortOrder
------ ------- ---------------------- ---------
1 1 A Name 1
3 1 B Name 2
2 1 C Name 3
6 2 A Name 1
4 2 Z Name 2
5 2 Z Name 3
This could either be done in the form of an UPDATE statement or doing an INSERT INTO (...) SELECT FROM (...) if it's easier to move the data from one table to the next.
Easy - use a CTE (Common Table Expression) and the ROW_NUMBER() ranking function:
;WITH OrderedData AS
(
SELECT Id, OwnerId, Name,
ROW_NUMBER() OVER(PARTITION BY OwnerId ORDER BY Name, Id) AS 'SortOrder'
FROM
dbo.YourTable
)
SELECT *
FROM OrderedData
ORDER BY OwnerId, SortOrder
The PARTITION BY clause splits your data into one group for each value of OwnerId, and ROW_NUMBER() then starts counting at 1 for each new group.
Update: If you want to update your table to set the SortOrder column - try this:
;WITH OrderedData AS
(
SELECT
Id, OwnerId, Name,
SortOrder, -- the CTE must expose the column that the UPDATE sets
ROW_NUMBER() OVER(PARTITION BY OwnerId ORDER BY Name, Id) AS RowNum
FROM
dbo.YourTable
)
UPDATE OrderedData
SET SortOrder = RowNum
That should set the SortOrder column to the values that the ROW_NUMBER() function returns
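For anyone who wants to try this without touching real data, here is a minimal self-contained sketch. It assumes SQL Server 2008 or later; @YourTable is a hypothetical table variable standing in for dbo.YourTable, and the varchar(50) length is a guess. Note that the CTE has to select the SortOrder column for the UPDATE to be able to set it.

```sql
DECLARE @YourTable TABLE (Id int, OwnerId int, Name varchar(50), SortOrder int NULL);

-- sample rows from the question; SortOrder starts out NULL
INSERT INTO @YourTable (Id, OwnerId, Name)
VALUES (1, 1, 'A Name'), (2, 1, 'C Name'), (3, 1, 'B Name'),
       (4, 2, 'Z Name'), (5, 2, 'Z Name'), (6, 2, 'A Name');

;WITH OrderedData AS
(
    SELECT SortOrder,
           ROW_NUMBER() OVER (PARTITION BY OwnerId ORDER BY Name, Id) AS RowNum
    FROM @YourTable
)
-- updating the CTE writes through to the underlying table
UPDATE OrderedData
SET SortOrder = RowNum;

SELECT Id, OwnerId, Name, SortOrder
FROM @YourTable
ORDER BY OwnerId, SortOrder;
```

The final SELECT should list the rows in the order shown under "What I need": Ids 1, 3, 2 for owner 1 and Ids 6, 4, 5 for owner 2, with SortOrder 1 to 3 in each group.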