SQL group by clause with two columns - sql

I have a T-SQL view that I need to group by one column. However, I am using NHibernate (C#) and am required to specify an Id column too. My query looks like:
SELECT
row_number() over(order by id)as Id,
column_name,..etc
from tblName
group by column_name
which gives me an error that the Id has to be included in the group by clause.
Alternatively, I can write:
SELECT
row_number() over(order by id)as Id,
column_name,..etc
from tblName
group by column_name, id
which returns multiple rows with the same column_name.
Is there a way around this?

I think you want to do this:
Select row_number() over(order by column_name) as ID, column_name from (
Select distinct column_name from tblName
) as A
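A minimal sketch of the subquery approach above, run from Python against an in-memory SQLite database (an assumption made for convenience; the question is about T-SQL, and SQLite needs version 3.25 or later for window functions). The sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: tblName with repeated column_name values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblName (id INTEGER PRIMARY KEY, column_name TEXT)")
conn.executemany(
    "INSERT INTO tblName (column_name) VALUES (?)",
    [("b",), ("a",), ("b",), ("c",), ("a",)],
)

# De-duplicate in the subquery first, then number the surviving rows.
rows = conn.execute(
    """
    SELECT row_number() OVER (ORDER BY column_name) AS Id, column_name
    FROM (SELECT DISTINCT column_name FROM tblName) AS A
    ORDER BY Id
    """
).fetchall()

for row in rows:
    print(row)
```

Each distinct column_name appears once, with a generated Id that is unique per row, which is what a mapper that insists on an Id column needs.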

Do you mean this?
SELECT
row_number() over(partition by column_name order by id)as Id,
column_name,..etc
from tblName

Related

Find duplicate ID and add new sequence ID

I have a table where ID must be unique. There are some IDs that are not unique. How do I generate a new column which adds a sequence to this ID? I want to generate ID_new_generated in the table below
ID Company Name ID_new_generated
1 A 1
1 B 1_2
2 C 2
You can use a window function (e.g. rank) to generate a secondary ID over each window defined by rows that share the same ID, then concatenate it to the original ID to create the new one.
Something like:
select
ID
, companyName
, rank() over(partition by ID ORDER BY companyName)
, concat(ID, '_', rank() over(partition by ID ORDER BY companyName)) as new_id
from test;
See this demo: https://www.db-fiddle.com/f/bd6aQKnZ7gcZCQjFpZicrp/0
The syntax will differ depending on which SQL dialect you are using.
Assuming you are looking for a solution in SQL Server:
First you will need to add a nullable column ID_Generated like below:
ALTER TABLE tablename
ADD ID_Generated varchar(25) null
GO
Then, use row_number like below in a cte structure (you can use temp table if you are using mysql):
;with cte as (
-- number the rows within each ID and count how many rows share that ID
SELECT t.ID_Generated,
ROW_NUMBER() over (partition by t.ID order by t.ID) as RowNumber,
COUNT(*) over (partition by t.ID) as RecCount
FROM tablename t
)
-- updating through the cte writes to the underlying table rows
Update cte
set ID_Generated = RowNumber
where RecCount > 1
I think you want:
select ID, companyName,
(case when row_number() over (partition by id order by companyname) = 1
then cast(id as varchar(255))
else id || '_' || row_number() over (partition by id order by companyname)
end) as new_id
from test;
|| is the ANSI/ISO standard concatenation operator in SQL. Not all databases support it, so you might need to replace the operator with the one appropriate for your database.
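As a quick check of the CASE / row_number() query above, here is a sketch that runs it on the three sample rows from the question. It assumes SQLite (3.25 or later) driven from Python, which supports both `||` and window functions; the cast target is written as TEXT for SQLite.

```python
import sqlite3

# The three rows from the question's example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (ID INTEGER, companyName TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "C")],
)

# The first row of each ID keeps the bare ID; later rows get a _N suffix.
new_ids = conn.execute(
    """
    SELECT ID, companyName,
           (CASE WHEN row_number() OVER (PARTITION BY ID ORDER BY companyName) = 1
                 THEN CAST(ID AS TEXT)
                 ELSE ID || '_' || row_number() OVER (PARTITION BY ID ORDER BY companyName)
            END) AS new_id
    FROM test
    ORDER BY ID, companyName
    """
).fetchall()

for row in new_ids:
    print(row)
```

The generated values are 1, 1_2 and 2, matching the ID_new_generated column the question asks for.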

How to replace a DISTINCT ON with GROUP BY in PostgreSQL 9?

I have been using the DISTINCT ON predicate and have decided to replace it with GROUP BY, mainly because it "is not part of the SQL standard and is sometimes considered bad style because of the potentially indeterminate nature of its results".
I am using DISTINCT ON in conjunction with ORDER BY in order to select the latest records in a history table, but it's not clear to me how to do the same with the GROUP BY.
What could be a general approach in order to move from one construct to the other one?
An example could be
SELECT
DISTINCT ON (f1, f2 ) *
FROM table
ORDER BY f1, f2, datefield DESC;
where I get the "latest" pairs of (f1,f2).
If you have a query like this:
select distinct on (col1) t.*
from table t
order by col1, col2
Then you would replace this with window functions, not a group by:
select t.*
from (select t.*,
row_number() over (partition by col1 order by col2) as seqnum
from table t
) t
where seqnum = 1;
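To make the row_number() rewrite concrete for the "latest record per (f1, f2)" case from the question, here is a small sketch. It assumes SQLite (3.25 or later) from Python rather than PostgreSQL, a hypothetical table named `history`, and made-up rows; the extra `payload` column is only there to show that whole rows come back.

```python
import sqlite3

# Hypothetical history table with several versions per (f1, f2) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (f1 TEXT, f2 TEXT, datefield TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [
        ("a", "x", "2020-01-01", "old"),
        ("a", "x", "2020-02-01", "new"),
        ("b", "y", "2020-01-15", "only"),
        ("a", "z", "2019-12-31", "first"),
        ("a", "z", "2020-03-01", "second"),
    ],
)

# Rank the rows of each (f1, f2) pair from newest to oldest, keep rank 1.
latest = conn.execute(
    """
    SELECT f1, f2, datefield, payload
    FROM (SELECT h.*,
                 row_number() OVER (PARTITION BY f1, f2
                                    ORDER BY datefield DESC) AS seqnum
          FROM history h) AS t
    WHERE seqnum = 1
    ORDER BY f1, f2
    """
).fetchall()

for row in latest:
    print(row)
```

The PARTITION BY list plays the role of the DISTINCT ON expressions, and the window's ORDER BY plays the role of the trailing sort columns.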

Replace function in Oracle SQL

I'm using Oracle SQL, and I have the following query:
select replace(replace('count(distinct <thiscol>) over (partition by <nextcol>) / count(*) over () as <thiscol>_<nextcol>,',
'<thiscol>', column_name
), '<nextcol>', lead(column_name) over (order by column_id)
)
from all_tab_columns atc
where table_name = 'mytable'
The output is supposed to be queries such as the following:
select id,
count(distinct name2) over (partition by name3) / count(*) over (),
count(distinct name3) over (partition by name4) / count(*) over (),
. . .
from mytable;
Instead of:
count(distinct name2) over (partition by name3) / count(*) over ()
I want to get:
count(distinct name3) over (partition by name2) / count(*) over ()
Can anyone advise how to swap the order of the column values (<thiscol> and <nextcol>)? I tried replacing <thiscol> with <nextcol>, but it gave me the same result. I tried many other things without success.
Anyone?
That is really strange. Instead, let's sort in the reverse order:
select replace(replace('count(distinct <thiscol>) over (partition by <nextcol>) / count(*) over () as <thiscol>_<nextcol>,',
'<thiscol>', column_name
), '<nextcol>', lead(column_name) over (order by column_id desc)
)
from all_tab_columns atc
where table_name = 'mytable';
Note the desc in the sort.
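The effect of the desc can be seen in isolation with a small sketch. It assumes SQLite (3.25 or later) from Python instead of Oracle, and a hypothetical table `cols` standing in for all_tab_columns, filled with the column names used in the question.

```python
import sqlite3

# Hypothetical stand-in for all_tab_columns, restricted to one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cols (column_id INTEGER, column_name TEXT)")
conn.executemany(
    "INSERT INTO cols VALUES (?, ?)",
    [(1, "id"), (2, "name2"), (3, "name3"), (4, "name4")],
)

# Compare lead() under ascending and descending column_id order.
pairs = conn.execute(
    """
    SELECT column_name,
           lead(column_name) OVER (ORDER BY column_id)      AS next_asc,
           lead(column_name) OVER (ORDER BY column_id DESC) AS next_desc
    FROM cols
    ORDER BY column_id
    """
).fetchall()

for row in pairs:
    print(row)
```

With the descending sort, the value substituted for <nextcol> is the column that precedes <thiscol>, so name3 is paired with name2 as requested. With unique column_id values, lag() over the ascending order should give the same pairing.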

Multiple columns in OVER ORDER BY

Is there a way to specify multiple columns in the OVER ORDER BY clause?
SELECT ROW_NUMBER() OVER(ORDER BY (A.Col1)) AS ID FROM MyTable A
The above works fine, but trying to add a second column does not work.
SELECT ROW_NUMBER() OVER(ORDER BY (A.Col1, A.Col2)) AS ID FROM MyTable A
Incorrect syntax near ','.
The problem is the extra parentheses around the column name. These should all work:
-- The standard way
SELECT ROW_NUMBER() OVER(ORDER BY A.Col1) AS ID FROM MyTable A
SELECT ROW_NUMBER() OVER(ORDER BY A.Col1, A.Col2) AS ID FROM MyTable A
-- Works, but unnecessary
SELECT ROW_NUMBER() OVER(ORDER BY (A.Col1), (A.Col2)) AS ID FROM MyTable A
Also, when you ask an SQL question, you should always specify which database you are querying against.
No brackets.
SELECT ROW_NUMBER() OVER(ORDER BY A.Col1, A.Col2) AS ID FROM MyTable A
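A short sketch of the comma-separated form in action, assuming SQLite (3.25 or later) from Python and a few invented rows in MyTable. Col1 decides the order first and Col2 breaks ties.

```python
import sqlite3

# Hypothetical rows, deliberately inserted out of order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Col1 INTEGER, Col2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(2, "a"), (1, "b"), (1, "a"), (2, "b")],
)

# Two sort keys in the window's ORDER BY, with no wrapping parentheses.
numbered = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (ORDER BY A.Col1, A.Col2) AS ID, A.Col1, A.Col2
    FROM MyTable A
    ORDER BY ID
    """
).fetchall()

for row in numbered:
    print(row)
```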

SQL to find the number of distinct values in a column

I can select all the distinct values in a column in the following ways:
SELECT DISTINCT column_name FROM table_name;
SELECT column_name FROM table_name GROUP BY column_name;
But how do I get the row count from that query? Is a subquery required?
You can use the DISTINCT keyword within the COUNT aggregate function:
SELECT COUNT(DISTINCT column_name) AS some_alias FROM table_name
This will count only the distinct values for that column.
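As a small illustration that no subquery is needed, the sketch below compares COUNT(DISTINCT ...) with the subquery form. It assumes SQLite driven from Python and invented sample values.

```python
import sqlite3

# Hypothetical data: five rows, three distinct values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_name TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?)",
    [("x",), ("y",), ("x",), ("z",), ("y",)],
)

# Direct form: DISTINCT inside the aggregate.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT column_name) AS some_alias FROM table_name"
).fetchone()[0]

# Equivalent subquery form, shown only for comparison.
subquery_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT column_name FROM table_name) AS t"
).fetchone()[0]

print(distinct_count, subquery_count)
```

The two forms agree here because the sample column contains no NULLs; COUNT(DISTINCT ...) skips NULL, while the subquery form would count it as one more row.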
This will give you BOTH the distinct column values and the count of each value. I usually find that I want to know both pieces of information.
SELECT [columnName], count([columnName]) AS CountOf
FROM [tableName]
GROUP BY [columnName]
A count of each unique value of column_name, sorted by frequency:
SELECT column_name, COUNT(*) FROM table_name GROUP BY column_name ORDER BY 2 DESC;
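A minimal sketch of the frequency query, assuming SQLite from Python and made-up values. `ORDER BY 2` sorts by the second output column, which is the count.

```python
import sqlite3

# Hypothetical data with clearly different frequencies per value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_name TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?)",
    [("x",), ("y",), ("x",), ("z",), ("y",), ("x",)],
)

# One row per distinct value, most frequent first.
frequencies = conn.execute(
    "SELECT column_name, COUNT(*) FROM table_name "
    "GROUP BY column_name ORDER BY 2 DESC"
).fetchall()

for row in frequencies:
    print(row)
```

The number of rows returned is itself the number of distinct values.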
Be aware that Count() ignores null values, so if you need to allow for null as its own distinct value you can do something tricky like:
select count(distinct my_col)
+ count(distinct Case when my_col is null then 1 else null end)
from my_table
/
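The NULL behaviour is easy to verify with a small sketch. It assumes SQLite from Python and invented rows, two of which are NULL, and puts the different COUNT variants side by side.

```python
import sqlite3

# Hypothetical data: 'a' twice, 'b' once, NULL twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_col TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("a",), ("a",), ("b",), (None,), (None,)],
)

all_rows, non_null, distinct_non_null, distinct_with_null = conn.execute(
    """
    SELECT count(*),                 -- every row, NULLs included
           count(my_col),            -- non-NULL values, duplicates included
           count(DISTINCT my_col),   -- distinct non-NULL values
           count(DISTINCT my_col)
             + count(DISTINCT CASE WHEN my_col IS NULL THEN 1 ELSE NULL END)
    FROM my_table
    """
).fetchone()

print(all_rows, non_null, distinct_non_null, distinct_with_null)
```

The CASE expression yields 1 for every NULL row, so its distinct count is 1 when any NULL exists and 0 otherwise, which adds NULL as one extra distinct value.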
SELECT COUNT(DISTINCT column_name) AS column_name_count FROM table_name;
You've got to count that distinct column, then give it an alias.
select count(*) from
(
SELECT distinct column1,column2,column3,column4 FROM abcd
) T
This will give the count of distinct combinations of the listed columns.
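A sketch of the multi-column case, assuming SQLite from Python and a few invented rows in abcd. Two rows are identical across all four columns, so they collapse into one combination.

```python
import sqlite3

# Hypothetical rows: the first two are exact duplicates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE abcd (column1 INTEGER, column2 INTEGER, "
    "column3 INTEGER, column4 INTEGER)"
)
conn.executemany(
    "INSERT INTO abcd VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (1, 1, 1, 1), (1, 1, 1, 2), (2, 1, 1, 1)],
)

# Count the distinct (column1, column2, column3, column4) combinations.
combo_count = conn.execute(
    """
    SELECT count(*) FROM (
        SELECT DISTINCT column1, column2, column3, column4 FROM abcd
    ) T
    """
).fetchone()[0]

print(combo_count)
```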
select Count(distinct columnName) as columnNameCount from tableName
Using the following SQL, we can get the distinct column value count in Oracle 11g.
select count(distinct(Column_Name)) from TableName
In MS SQL Server 2012 and later, you can use a window function too.
SELECT DISTINCT column_name, COUNT(column_name) OVER (PARTITION BY column_name)
FROM table_name
To do this in Presto using OVER:
SELECT DISTINCT my_col,
count(*) OVER (PARTITION BY my_col
ORDER BY my_col) AS num_rows
FROM my_tbl
Using this OVER based approach is of course optional. In the above SQL, I found specifying DISTINCT and ORDER BY to be necessary.
Caution: As per the docs, using GROUP BY may be more efficient.
select count(distinct(column_name)) AS columndatacount from table_name where somecondition=true
You can use this query, to count different/distinct data.
Without using DISTINCT, this is how we could do it:
SELECT COUNT(C)
FROM (SELECT COUNT(column_name) as C
FROM table_name
GROUP BY column_name) AS t
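The inner parentheses in Count(distinct({fieldname})) are redundant; Count(distinct {fieldname}) is enough.
Note that plain Count({fieldname}) does not give you the number of distinct values: it counts every non-null value, duplicates included. It is also not the same as Count(*), which counts all rows, including those where the field is null.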