SQL Count by equal columns Query

I have this table:
subscriberID | date | segmentID | Counter
------------------------------------------
1 | 1.1 | 2 | 3
1 | 2.1 | 4 | 2
1 | 3.1 | 4 | 5
2 | 1.1 | 1 | 12
2 | 2.1 | 1 | 1
2 | 3.1 | 2 | 10
3 | 1.1 | 2 | 4
I have to write an SQL query that does the following:
Get the top 3 most common segmentIDs (by Counter) for a given subscriberID.
Can anyone help me with that?
Thanks.

select segmentID
from your_table
where subscriberID = 123
group by segmentID
order by sum(counter) desc
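To get only 3 records you have to limit the result. The syntax depends on your DB engine: TOP 3 (SQL Server), LIMIT 3 (MySQL, PostgreSQL, SQLite), or ROWNUM <= 3 (Oracle). In Oracle, ROWNUM is assigned before ORDER BY is applied, so wrap the ordered query in a subquery and filter on ROWNUM outside it.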
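To make the answer concrete, here is a minimal sketch that runs the query against the sample data. It assumes SQLite (through Python's sqlite3 module) purely for illustration, so it uses LIMIT; the helper name top_segments is made up for this example.

```python
import sqlite3

# In-memory SQLite database; the engine choice is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (subscriberID INT, date TEXT, segmentID INT, counter INT)"
)
rows = [
    (1, "1.1", 2, 3), (1, "2.1", 4, 2), (1, "3.1", 4, 5),
    (2, "1.1", 1, 12), (2, "2.1", 1, 1), (2, "3.1", 2, 10),
    (3, "1.1", 2, 4),
]
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)


def top_segments(subscriber_id, n=3):
    """Return up to n (segmentID, total counter) pairs, largest total first."""
    query = """
        SELECT segmentID, SUM(counter) AS total
        FROM your_table
        WHERE subscriberID = ?
        GROUP BY segmentID
        ORDER BY total DESC
        LIMIT ?
    """
    return conn.execute(query, (subscriber_id, n)).fetchall()


print(top_segments(1))  # [(4, 7), (2, 3)]
print(top_segments(2))  # [(1, 13), (2, 10)]
```

Subscriber 1 has only two distinct segments, so fewer than three rows come back; the limit is an upper bound, not a guarantee.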

Related

SQL : Get the 3 first occurrences of a field

I have a PostgreSQL table with 2 fields like the following. Field A is the primary key.
A | B
------
1 | 1
2 | 1
3 | 1
4 | 1
5 | 2
6 | 2
7 | 2
8 | 2
9 | 2
10 | 3
11 | 3
I'm looking for a query that returns only the first 3 rows for each value of B, like this:
A | B
1 | 1
2 | 1
3 | 1
5 | 2
6 | 2
7 | 2
10 | 3
11 | 3
Does somebody have a solution?

You want row_number():
select t.*
from (select t.*, row_number() over (partition by b order by a) as seq
      from your_table t  -- your_table is a placeholder; "table" itself is a reserved word
     ) t
where seq <= 3;
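As a quick check of the row_number() approach, the sketch below loads the sample rows and runs the same query shape. It assumes SQLite 3.25 or later (the first version with window functions) through Python's sqlite3 module; the table name your_table and the alias ranked are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (a INTEGER PRIMARY KEY, b INTEGER)")

# Sample data from the question: a runs from 1 to 11.
b_values = [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3]
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)", list(enumerate(b_values, start=1))
)

# Number the rows inside each b group by ascending a, then keep the first three.
query = """
    SELECT a, b
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY b ORDER BY a) AS seq
          FROM your_table t) AS ranked
    WHERE seq <= 3
    ORDER BY a
"""
first_three = conn.execute(query).fetchall()
print(first_three)
# [(1, 1), (2, 1), (3, 1), (5, 2), (6, 2), (7, 2), (10, 3), (11, 3)]
```

The group with b = 3 has only two rows, so both are kept.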

hive - split a row into multiple rows between the range of values

I have the table below and would like to split each row into one row per value in the range from start to end,
i.e. id and value should repeat for each value between start and end (both inclusive).
--------------------------------------
id | value | start | end
--------------------------------------
1 | 5 | 1 | 4
2 | 8 | 5 | 9
--------------------------------------
Desired output
--------------------------------------
id | value | current
--------------------------------------
1 | 5 | 1
1 | 5 | 2
1 | 5 | 3
1 | 5 | 4
2 | 8 | 5
2 | 8 | 6
2 | 8 | 7
2 | 8 | 8
2 | 8 | 9
--------------------------------------
I can write my own UDF in Java/Python to get this result, but I would like to check whether it can be done in Hive SQL using existing Hive UDFs.
Thanks in advance.

In engines that support it, this can be done with a recursive common table expression, but Hive doesn't support recursive CTEs.
One option is to create a table of numbers and use it to generate the rows between start and end.
create table numbers
location 'hdfs_location' as
select row_number() over(order by somecolumn) as num
from some_table --this can be any table with the desired number of rows
;
--Join it with the existing table
select t.id,t.value,n.num as current
from tbl t
join numbers n on n.num>=t.start and n.num<=t.end
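The numbers-table join can be tried outside Hive as well. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of Hive, fills numbers with 1 to 100 from Python rather than from row_number(), and quotes start, end and current because some engines treat them as keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl (id INT, value INT, "start" INT, "end" INT)')
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", [(1, 5, 1, 4), (2, 8, 5, 9)])

# Helper table of consecutive integers; it must cover the largest "end" value.
conn.execute("CREATE TABLE numbers (num INT)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(n,) for n in range(1, 101)])

# Each source row matches every number inside its inclusive range.
query = """
    SELECT t.id, t.value, n.num AS "current"
    FROM tbl t
    JOIN numbers n ON n.num >= t."start" AND n.num <= t."end"
    ORDER BY t.id, n.num
"""
expanded = conn.execute(query).fetchall()
for row in expanded:
    print(row)
```

If the numbers table is too short, rows whose range extends past its last value are silently truncated, so size it from the maximum end value.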

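You can do this with the posexplode() function. space(n) returns a string of n spaces, so splitting space(`end` - start) on ' ' yields `end` - start + 1 empty strings, and posexplode() emits each one together with its zero-based position, which is used here as the offset diff from start.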
WITH
data AS (
SELECT 1 AS id, 5 AS value, 1 AS start, 4 AS `end`
UNION ALL
SELECT 2 AS id, 8 AS value, 5 AS start, 9 AS `end`
)
SELECT distinct id, value, (zr.start+rge.diff) as `current`
FROM data zr LATERAL VIEW posexplode(split(space(zr.`end`-zr.start),' ')) rge as diff, x
Here is the output:
+-----+--------+----------+--+
| id | value | current |
+-----+--------+----------+--+
| 1 | 5 | 1 |
| 1 | 5 | 2 |
| 1 | 5 | 3 |
| 1 | 5 | 4 |
| 2 | 8 | 5 |
| 2 | 8 | 6 |
| 2 | 8 | 7 |
| 2 | 8 | 8 |
| 2 | 8 | 9 |
+-----+--------+----------+--+
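The space/split/posexplode trick is easier to see outside SQL. The following Python sketch only imitates the behaviour the query relies on (it does not call Hive); expand_range is a hypothetical helper name.

```python
def expand_range(start, end):
    """Imitate split(space(end - start), ' ') followed by posexplode."""
    # A string of n spaces split on ' ' gives n + 1 empty strings.
    pieces = (" " * (end - start)).split(" ")
    # posexplode pairs each element with its zero-based position;
    # the position is the offset added to start.
    return [start + pos for pos, _ in enumerate(pieces)]


print(expand_range(1, 4))  # [1, 2, 3, 4]
print(expand_range(5, 9))  # [5, 6, 7, 8, 9]
```

When start equals end the string is empty, the split still yields one element, and the single value is returned.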

How to select if similar field count is the maximum in the table?

I want to select a group from a table only if all of its rows have the required value in another column.
For example:
| user_id | team_id | isOk |
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 4 | 1 | 1 |
| 5 | 2 | 1 |
| 6 | 2 | 1 |
| 7 | 2 | 1 |
| 8 | 3 | 1 |
| 9 | 3 | 1 |
| 10 | 3 | 1 |
| 11 | 3 | 0 |
So I want to select teams 1 and 2, because all of their rows have the value 1 in the isOk column.
I tried this query:
SELECT Team
FROM _Table1
WHERE isOk= 1
GROUP BY Team
HAVING COUNT(*) > 3
But it still requires me to hard-code a row count, which may or may not match the actual number of rows in each team.
Thanks in advance.

Is this what you are looking for?
select team
from _table1
group by team
having min(isOk) = 1;
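Why this works: if isOk only ever holds 0 or 1, the minimum over a team is 1 exactly when no row in that team is 0. Here is a small sketch of that against the sample data, assuming SQLite through Python's sqlite3 module and a column named team as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE _table1 (user_id INT, team INT, isOk INT)")
rows = [
    (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1),
    (5, 2, 1), (6, 2, 1), (7, 2, 1),
    (8, 3, 1), (9, 3, 1), (10, 3, 1), (11, 3, 0),
]
conn.executemany("INSERT INTO _table1 VALUES (?, ?, ?)", rows)

# A team qualifies only if its smallest isOk value is 1, i.e. it has no 0 rows.
query = """
    SELECT team
    FROM _table1
    GROUP BY team
    HAVING MIN(isOk) = 1
    ORDER BY team
"""
ok_teams = [team for (team,) in conn.execute(query)]
print(ok_teams)  # [1, 2]
```

Team 3 is dropped because user 11 has isOk = 0, and no row count has to be hard-coded.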

How to get the max row count grouped by the ID in sql

Say I have this table:
(column: Row is a count based on the column ID)
ID | Row | State |
1 | 1 | CA |
1 | 2 | AK |
2 | 1 | KY |
2 | 2 | GA |
2 | 3 | FL |
3 | 1 | WY |
3 | 2 | HI |
3 | 3 | NY |
3 | 4 | DC |
4 | 1 | RI |
I'd like to generate a new column that would have the highest number in the Row column grouped by the ID column for each row. How would I accomplish this? I've been messing around with MAX(), GROUP BY, and some partitioning but I'm getting different errors each time. It's difficult to finesse this correctly. Here's my target output:
ID | Row | State | MaxRow
1 | 1 | CA | 2
1 | 2 | AK | 2
2 | 1 | KY | 3
2 | 2 | GA | 3
2 | 3 | FL | 3
3 | 1 | WY | 4
3 | 2 | HI | 4
3 | 3 | NY | 4
3 | 4 | DC | 4
4 | 1 | RI | 1

Use the window version of MAX:
SELECT ID, Row, State, MAX(Row) OVER (PARTITION BY ID) AS MaxRow
FROM mytable

You could join the table to an aggregate subquery on the same table:
SELECT t.*, max_row
FROM t
JOIN (SELECT id, MAX([row]) AS max_row
FROM t
GROUP BY id) agg ON t.id = agg.id

First write a query that uses GROUP BY id and MAX to get the highest Row number per ID. Then use that query as a subquery and inner join it to the table on the id.
The max column from the subquery then gives you the MaxRow value for every row.
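The subquery-and-join approach can be sketched as follows. This assumes SQLite through Python's sqlite3 module for illustration, with the table named t and the row column double-quoted because ROW is a keyword in several engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (id INT, "row" INT, state TEXT)')
rows = [
    (1, 1, "CA"), (1, 2, "AK"),
    (2, 1, "KY"), (2, 2, "GA"), (2, 3, "FL"),
    (3, 1, "WY"), (3, 2, "HI"), (3, 3, "NY"), (3, 4, "DC"),
    (4, 1, "RI"),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# The subquery computes one max per id; the join copies it onto every row.
query = """
    SELECT t.id, t."row", t.state, agg.max_row
    FROM t
    JOIN (SELECT id, MAX("row") AS max_row
          FROM t
          GROUP BY id) AS agg ON t.id = agg.id
    ORDER BY t.id, t."row"
"""
with_max = conn.execute(query).fetchall()
for record in with_max:
    print(record)
```

This form also runs on engines without window functions; where they are available, MAX(...) OVER (PARTITION BY id) gives the same result in a single pass over the table.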

SQL - Select distinct on two column

I have this table 'words' with more information:
+---------+------------+-----------
| ID |ID_CATEGORY | ID_THEME |
+---------+------------+-----------
| 1 | 1 | 1
| 2 | 1 | 1
| 3 | 1 | 1
| 4 | 1 | 2
| 5 | 1 | 2
| 6 | 1 | 2
| 7 | 2 | 3
| 8 | 2 | 3
| 9 | 2 | 3
| 10 | 2 | 4
| 11 | 2 | 4
| 12 | 3 | 5
| 13 | 3 | 5
| 14 | 3 | 6
| 15 | 3 | 6
| 16 | 3 | 6
And this query, which gives me 3 random ids from different categories, but not necessarily from different themes as well:
SELECT Id
FROM words
GROUP BY Id_Category, Id_Theme
ORDER BY RAND()
LIMIT 3
What I want as result is:
+---------+------------+-----------
| ID |ID_CATEGORY | ID_THEME |
+---------+------------+-----------
| 2 | 1 | 1
| 7 | 2 | 3
| 14 | 3 | 6
That is, no category or theme should repeat.

In standard SQL (and in MySQL with ONLY_FULL_GROUP_BY enabled), a query that uses GROUP BY cannot select a column that is neither listed in the GROUP BY clause nor wrapped in an aggregate function. So your query cannot reliably include Id in the select list.
You need to do something a bit more complex:
SELECT Id_Category, Id_Theme,
(SELECT Id FROM Words W
WHERE W.Id_Category = G.Id_Category AND W.Id_Theme = G.Id_Theme
ORDER BY RAND() LIMIT 1
) Id
FROM Words G
GROUP BY Id_Category, Id_Theme
ORDER BY RAND()
LIMIT 3
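NOTE: the query groups by the required columns, and the subselect picks a random Id from all the Ids in each group. The main query then orders the groups randomly and keeps three of them. Because the grouping is on (Id_Category, Id_Theme) pairs, the three rows always have distinct pairs, but two of them can still share a category.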
