Selecting distinct values within a a group - sql

I want to select distinct values of one variable within a group defined by another variable. What is the easiest way?
My first thought was to combine group by and distinct but it does not work. I tried something like:
select distinct col2, col1 from myTable
group by col1
I have looked at this one here but can't seem to solve my problem
Using DISTINCT along with GROUP BY in SQL Server_
Table example

If your requirement is to pick distinct combinations if col1 and COL2 then no need to group by just use
SELECT DISTINCT COL1, COL2 FROM TABLE1;
But if you want to group by then automatically one record per group is displayed by then you have to use aggregate function of one of the columns i.e.
SELECT COL1, COUNT(COL2)
FROM TABLE1 GROUP BY COL1;

no need group by just use distinct
select distinct col2, col1 from myTable

create table t as
with inputs(val, id) as
(
select 'A', 1 from dual union all
select 'A', 1 from dual union all
select 'A', 2 from dual union all
select 'B', 1 from dual union all
select 'B', 2 from dual union all
select 'C', 3 from dual
)
select * from inputs;
The above creates your table and the below is the solution (12c and later):
select * from t
match_recognize
(
partition by val
order by id
all rows per match
pattern ( a {- b* -} )
define b as val = a.val and id = a.id
);
Output:
Regards,
Ranagal

Related

Count distinct letters in a string in bigquery

I have a string column in Biquery like:
select 'A'
union all (select 'ab')
union all (select 'abc')
union all (select 'aa')
union all (select 'aab')
I would like to count the number of distinct characters in every row of the column, in this case the results would be:
1
2
3
1
2
Can this be done in BigQuery? How?
How about this (assuming you don't want to differentiate between uppercase and lowercase)...
WITH data AS (select 'A' AS `val`
union all (select 'ab')
union all (select 'abc')
union all (select 'aa')
union all (select 'aab'))
SELECT `val`, 26 - LENGTH(REGEXP_REPLACE('abcdefghijklmnopqrstuvwxyz', '['||LOWER(`val`)||']', ''))
FROM `data`;
A simple approach is to use the SPLIT to convert your string to an array and UNNEST to convert the resulting array to a table. You may then use COUNT and DISTINCT to determine the number of unique characters as shown below:
with my_data AS (
select 'A' as col
union all (select 'ab')
union all (select 'abc')
union all (select 'aa')
union all (select 'aab')
)
select col, (SELECT COUNT(*) FROM (
SELECT DISTINCT element FROM UNNEST(SPLIT(col,'')) as element
)) n from my_data;
or simply
WITH my_data AS (
SELECT 'A' as col UNION ALL
SELECT 'ab' UNION ALL
SELECT 'abc' UNION ALL
SELECT 'aa' UNION ALL
SELECT 'aab'
)
SELECT
col,
(
SELECT
COUNT(DISTINCT element)
FROM
UNNEST(SPLIT(col,'')) as element
) cnt
FROM
my_data;
Like previous but using COUNT with DISTINCT
with my_data AS (
select 'A' as col
union all (select 'ab')
union all (select 'abc')
union all (select 'aa')
union all (select 'aab')
)
select col, COUNT(DISTINCT element) FROM
my_data,UNNEST(SPLIT(col,'')) as element
GROUP BY col
If the data is not quite huge, I would rather go with the user-defined functions to ease up the string manipulation across different columns
CREATE TEMP FUNCTION
get_unique_char_count(x STRING)
RETURNS INT64
LANGUAGE js AS r"""
str_split = new Set(x.split(""));
return str_split.size;
""";
WITH
result AS (
SELECT
'A' AS val
UNION ALL (
SELECT
'ab')
UNION ALL (
SELECT
'abc')
UNION ALL (
SELECT
'aa')
UNION ALL (
SELECT
'aab') )
SELECT
val,
get_unique_char_count(val) unique_char_count
FROM
result
RESULT:

I want to retrieve max value from two column - SQL table

Please find attached image for table structure
You could try using UNION ALL for build a unique column result and select the max from this
select max(col)
from (
select col1 col
from trans_punch
union all
select col2
from trans_punch) t
You can use a common table expression to union the columns, then select the max.
;with cteUnionPunch(Emp_id, both_punch) AS
(
SELECT Emp_id, In_Punch FROM trans_punch
UNION ALL
SELECT Emp_id, Out_Punch FROM trans_punch
)
SELECT Emp_id, max(both_punch) FROM cteUnionPunch GROUP BY Emp_id
You can use apply :
select tp.Emp_id, max(tpp.Punchs)
from trans_punch as tp cross apply
( values (In_Punch), (Out_Punch) ) tpp(Punchs)
group by tp.Emp_id;

SQL assigning incremental ID to subgroups

As the title says, I'm trying to add an extra column to a table which autoincrements everytime a different string in another column changes.
I would like to do this in a query.
Example:
MyCol GroupID
Cable 1
Cable 1
Foo 2
Foo 2
Foo 2
Fuzz 3
Fizz 4
Tv 5
Tv 5
The GroupID column is what I want to accomplish.
We can be sure that MyCol's strings will be the same in each subgroup (Foo will always be Foo, etc).
Thanks in advance
If I understand correctly, you can use dense_rank():
select t.*, dense_rank() over (order by col1) as groupid
from t;
You could make a temporal table with the distinct value of the MyCol and get the groupId throught the RowNumber of the temp table, and join the rownumbered result with your table.
This is a raw example in oracle:
WITH data AS
(SELECT 'Cable' MyCol FROM dual
UNION ALL
SELECT 'Cable' FROM dual
UNION ALL
SELECT 'Foo' FROM dual
UNION ALL
SELECT 'Foo' FROM dual
UNION ALL
SELECT 'Foo' FROM dual
UNION ALL
SELECT 'Fuzz' FROM dual
UNION ALL
SELECT 'Fizz' FROM dual
UNION ALL
SELECT 'Tv' FROM dual
UNION ALL
SELECT 'Tv' FROM dual
),
tablename AS
(SELECT * FROM data
),
temp AS
( SELECT DISTINCT mycol FROM tablename
),
temp2 AS
( SELECT mycol, rownum AS groupid from temp
)
SELECT tablename.mycol, temp2.groupid FROM temp2 JOIN tablename ON temp2.mycol = tablename.mycol
You could also check for a way to implement the tabibitosan method knowing that your column condition is string.

How to create table with multiple rows and columns using only SELECT clause (i.e. using SELECT without FROM clause)

I know that in SQL Server, one can use SELECT clause without FROM clause, and create a table with one row and one column
SELECT 1 AS n;
But I was just wondering, is it possible to use SELECT clause without FROM clause, to create
a table with one column and multiple rows
a table with multiple columns and one row
a table with multiple columns and multiple rows
I have tried many combinations such as
SELECT VALUES(1, 2) AS tableName(n, m);
to no success.
You can do it with CTE and using union(Use union all if you want to display duplicates)
Rextester Sample for all 3 scenarios
One Column and multiple rows
with tbl1(id) as
(select 1 union all
select 2)
select * from tbl1;
One row and multiple columns
with tbl2(id,name) as
(select 1,'A')
select * from tbl2;
Multiple columns and multiple rows
with tbl3(id,name) as
(select 1,'A' union all
select 2,'B')
select * from tbl3;
-- One column, multiple rows.
select 1 as ColumnName union all select 2; -- Without FROM;
select * from ( values ( 1 ), ( 2 ) ) as Placeholder( ColumnName ); -- With FROM.
-- Multiple columns, one row.
select 1 as TheQuestion, 42 as TheAnswer; -- Without FROM.
select * from ( values ( 1, 42 ) ) as Placeholder( TheQuestion, TheAnswer ); -- With FROM.
-- Multiple columns and multiple rows.
select 1 as TheQuestion, 42 as TheAnswer union all select 1492, 12; -- Without FROM.
select * from ( values ( 1, 2 ), ( 2, 4 ) ) as Placeholder( Column1, Column2 ); -- With FROM.
You can do all that by using UNION keyword
create table tablename as select 1 as n,3 as m union select 2 as n,3 as m
In Oracle it will be dual:
create table tablename as select 1 as n,3 as m from dual union select 2 as n,3 as m from dual
You can use UNION operator:
CREATE TABLE AS SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2;
The UNION operator selects only distinct values by default. To allow duplicate values you can use UNION ALL.
The column names in the result-set are usually equal to the column names in the first SELECT statement in the UNION.
try this:
--1) a table with one column and multiple rows
select * into tmptable0 from(
select 'row1col1' as v1
union all
select 'row2col1' as v1
) tmp
--2) a table with multiple columns and one row
select 'row1col1' as v1, 'row1col2' as v2 into tmptable1
--3) a table with multiple columns and multiple rows
select * into tmptable2 from(
select 'row1col1' as v1, 'row1col2' as v2
union all
select 'row2col1' as v2, 'row2col2' as v2
) tmp
One can create a view and later can query it whenever required
-- table with one column and multiple rows
create view vw1 as
(
select 'anyvalue' as col1
union all
select 'anyvalue' as col1
)
select * from vw1
-- table with multiple columns and multiple rows
create view vw2 as
(
select 'anyvalue1' as col1, 'anyvalue1' as col2
union all
select 'anyvalue2' as col1, 'anyvalue2' as col2
)
select * from vw2

How to use order by with union all in sql?

I tried the sql query given below:
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
It results in the following error:
The ORDER BY clause is invalid in views, inline functions, derived
tables, subqueries, and common table expressions, unless TOP or FOR
XML is also specified.
I need to use order by in union all. How do I accomplish this?
SELECT *
FROM
(
SELECT * FROM TABLE_A
UNION ALL
SELECT * FROM TABLE_B
) dum
-- ORDER BY .....
but if you want to have all records from Table_A on the top of the result list, the you can add user define value which you can use for ordering,
SELECT *
FROM
(
SELECT *, 1 sortby FROM TABLE_A
UNION ALL
SELECT *, 2 sortby FROM TABLE_B
) dum
ORDER BY sortby
You don't really need to have parenthesis. You can sort directly:
SELECT *, 1 AS RN FROM TABLE_A
UNION ALL
SELECT *, 2 AS RN FROM TABLE_B
ORDER BY RN, COLUMN_1
Not an OP direct response, but I thought I would jimmy in here responding to the the OP's ERROR messsage, which may point you in another direction entirely!
All these answers are referring to an overall ORDER BY once the record set has been retrieved and you sort the lot.
What if you want to ORDER BY each portion of the UNION independantly, and still have them "joined" in the same SELECT?
SELECT pass1.* FROM
(SELECT TOP 1000 tblA.ID, tblA.CustomerName
FROM TABLE_A AS tblA ORDER BY 2) AS pass1
UNION ALL
SELECT pass2.* FROM
(SELECT TOP 1000 tblB.ID, tblB.CustomerName
FROM TABLE_B AS tblB ORDER BY 2) AS pass2
Note the TOP 1000 is an arbitary number. Use a big enough number to capture all of the data you require.
There will be times when you need to do something like this :
Pull top 5 from table 1 based on a sort
and bottom 5 from table 2 based on another sort
and union these together.
solution
select * from (
-- top 5 records
select top 5 col1, col2, col3
from table1
group by col1, col2
order by col3 desc ) z
union all
select * from (
-- bottom 5 records
select top 5 col1, col2, col3
from table2
group by col1, col2
order by col3 ) z
this was the only way i was able to get around the error and worked fine for me.
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
ORDER BY 2;
2 is column number here .. In Oracle SQL you can use the column number by which you want to sort the data
This solved my SELECT statement:
SELECT * FROM
(SELECT id,name FROM TABLE_A
UNION ALL
SELECT id,name FROM TABLE_B ) dum
order by dum.id , dum.name
where id and name columns available in tables and you can use your columns .
Simply use that , no need parenthesis or anything else
SELECT *, id as TABLE_A_ID FROM TABLE_A
UNION ALL
SELECT *, id as TABLE_B_ID FROM TABLE_B
ORDER BY TABLE_A_ID, TABLE_B_ID
ORDER BY after the last UNION should apply to both datasets joined by union.
The solution shown below:
SELECT *,id AS sameColumn1 FROM Locations
UNION ALL
SELECT *,id AS sameColumn2 FROM Cities
ORDER BY sameColumn1,sameColumn2
select CONCAT(Name, '(',substr(occupation, 1, 1), ')') AS f1
from OCCUPATIONS
union
select temp.str AS f1 from
(select count(occupation) AS counts, occupation, concat('There are a total of ' ,count(occupation) ,' ', lower(occupation),'s.') As str from OCCUPATIONS group by occupation order by counts ASC, occupation ASC
) As temp
order by f1