Get a count of times distinct values occur in each column separately - sql

Here is what the source table looks like:
╔══════╦══════╦══════╗
║ COL1 ║ COL2 ║ COL3 ║
╠══════╬══════╬══════╣
║ A ║ A ║ A ║
║ A ║ A ║ B ║
║ A ║ B ║ C ║
║ B ║ B ║ C ║
║ B ║ C ║ C ║
║ C ║ C ║ C ║
╚══════╩══════╩══════╝
I am looking to end up with results like this:
╔════════╦══════╦══════╦══════╗
║ VALUES ║ COL1 ║ COL2 ║ COL3 ║
╠════════╬══════╬══════╬══════╣
║ A ║ 3 ║ 2 ║ 1 ║
║ B ║ 2 ║ 2 ║ 1 ║
║ C ║ 1 ║ 2 ║ 4 ║
╚════════╩══════╩══════╩══════╝
I know this can be done with unions, but my table has a large number of columns, so I was hoping to find a more elegant solution.

If all values appear in the first column, you can get the counts for the first column with a simple group by, then use a cross join and conditional aggregation to get the counts for the other columns:
select t1.myvalues, t1.col1,
sum(case when t2.col2 = t1.myvalues then 1 else 0 end) col2,
sum(case when t2.col3 = t1.myvalues then 1 else 0 end) col3
from (
select col1 myvalues, count(*) col1
from Table1 group by col1
) t1 cross join Table1 t2
group by t1.myvalues, t1.col1
http://sqlfiddle.com/#!4/5b35b/1
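To try the cross join idea without an Oracle instance, here is a minimal sketch that runs essentially the same query against an in-memory SQLite database from Python. The use of SQLite, the explicit ORDER BY, and the qualified `Table1.col1` in the inner GROUP BY are assumptions made for this illustration, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [("A", "A", "A"), ("A", "A", "B"), ("A", "B", "C"),
     ("B", "B", "C"), ("B", "C", "C"), ("C", "C", "C")],
)

query = """
SELECT t1.myvalues,
       t1.col1,
       SUM(CASE WHEN t2.col2 = t1.myvalues THEN 1 ELSE 0 END) AS col2,
       SUM(CASE WHEN t2.col3 = t1.myvalues THEN 1 ELSE 0 END) AS col3
FROM (
    -- counts for the first column; also supplies the list of values
    SELECT col1 AS myvalues, COUNT(*) AS col1
    FROM Table1
    GROUP BY Table1.col1
) t1
CROSS JOIN Table1 t2
GROUP BY t1.myvalues, t1.col1
ORDER BY t1.myvalues
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Each value from the first column is paired with every source row, so the two SUM expressions count matches in col2 and col3 for that value.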

select 'A' as col,
sum(decode(col1,'A',1,0)) as col1,
sum(decode(col2,'A',1,0)) as col2,
sum(decode(col3,'A',1,0)) as col3
from test_t
union
select 'B' as col,
sum(decode(col1,'B',1,0)) as col1,
sum(decode(col2,'B',1,0)) as col2,
sum(decode(col3,'B',1,0)) as col3
from test_t
union
select 'C' as col,
sum(decode(col1,'C',1,0)) as col1,
sum(decode(col2,'C',1,0)) as col2,
sum(decode(col3,'C',1,0)) as col3
from test_t
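Since the table in the question has many columns, one possible way to avoid writing a block per value and per column by hand is to generate the query text. The sketch below assumes Python with SQLite, a hypothetical `columns` list, and portable CASE expressions in place of Oracle's decode; the column names are interpolated into the SQL, so they must come from a trusted source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO test_t VALUES (?, ?, ?)",
    [("A", "A", "A"), ("A", "A", "B"), ("A", "B", "C"),
     ("B", "B", "C"), ("B", "C", "C"), ("C", "C", "C")],
)

# Hypothetical column list; in practice it could be read from the catalog.
columns = ["col1", "col2", "col3"]

# Every distinct value found in any column (UNION removes duplicates).
values_sql = " UNION ".join(f"SELECT {c} AS val FROM test_t" for c in columns)

# One conditional count per column.
sums_sql = ",\n       ".join(
    f"SUM(CASE WHEN t.{c} = v.val THEN 1 ELSE 0 END) AS {c}" for c in columns
)

query = f"""
SELECT v.val,
       {sums_sql}
FROM ({values_sql}) v
CROSS JOIN test_t t
GROUP BY v.val
ORDER BY v.val
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Because the value list is built from all columns, this variant does not depend on every value appearing in the first column.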

Related

How to remove duplicate values from datatable SQL

I'm getting duplicate values:
╔══════╦══════╦═══════╦════════════╦═════════╦═════════╦══════╦═══════╗
║ ID ║ Name ║ Class ║ Date ║ Intime ║ Outtime ║ INAM ║ OUTPM ║
╠══════╬══════╬═══════╬════════════╬═════════╬═════════╬══════╬═══════╣
║ 1001 ║ Paul ║ 1st ║ 29-11-2022 ║ Holiday ║ Holiday ║ H ║ H ║
╠══════╬══════╬═══════╬════════════╬═════════╬═════════╬══════╬═══════╣
║ 1001 ║ Paul ║ 1st ║ 29-11-2022 ║ Holiday ║ Holiday ║ H ║ H ║
╠══════╬══════╬═══════╬════════════╬═════════╬═════════╬══════╬═══════╣
║ 1001 ║ Paul ║ 1st ║ 29-11-2022 ║ Holiday ║ Holiday ║ H ║ H ║
╚══════╩══════╩═══════╩════════════╩═════════╩═════════╩══════╩═══════╝
Code:
SELECT DISTINCT COALESCE(tt.ID,t1.ID) AS ID,
COALESCE(tt.Name,t1.Name) AS Name,
COALESCE(tt.Class,t1.Class) AS Class,tt.Date,
COALESCE(tt.Intime,t1.Intime) AS Intime,
COALESCE(tt.Outtime,t1.Outtime) AS Outtime,
COALESCE(tt.INAM,t1.INAM) AS INAM,
COALESCE(tt.OUTPM,t1.OUTPM) AS OUTPM
FROM stuattrecordAMPM AS t1
CROSS JOIN (SELECT * FROM stuattrecordAMPM UNION ALL
SELECT null,null,null,Date,Holiday_Name,Holiday_Name,Status,Status FROM HolidayList) AS tt
order by [ID]
DELETE FROM stuattrecordAMPM
WHERE Date IS NULL
With this code I'm getting duplicate values. How can I avoid duplicates in the data table?
You can give a row number to each row grouped by all the columns, then delete the rows having row number greater than 1.
Query
with cte as(
select *, row_number() over(
partition by [id], [name], [class], [date], [intime], [outtime], [inam], [outpm]
order by [id]
) as [rn]
from [your_table_name]
)
delete from cte
where [rn] > 1;
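The row-number idea can be tried outside SQL Server as well. SQLite cannot delete through a CTE, so this sketch (an adaptation, not the answer's exact statement) numbers the rows in a subquery and deletes by `rowid`. The table name `attendance`, the reduced column set, and the extra "Anna" row are hypothetical test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance (id INTEGER, name TEXT, class TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?, ?)",
    [
        (1001, "Paul", "1st", "29-11-2022"),
        (1001, "Paul", "1st", "29-11-2022"),
        (1001, "Paul", "1st", "29-11-2022"),
        (1002, "Anna", "1st", "29-11-2022"),  # hypothetical extra row
    ],
)

# Number the rows inside each group of identical rows, then delete
# every row numbered 2 or higher.
conn.execute("""
DELETE FROM attendance
WHERE rowid IN (
    SELECT rid
    FROM (
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (
                   PARTITION BY id, name, class, date
                   ORDER BY rowid
               ) AS rn
        FROM attendance
    )
    WHERE rn > 1
)
""")

remaining = conn.execute(
    "SELECT id, name, class, date FROM attendance ORDER BY id"
).fetchall()
print(remaining)
```

One row per distinct combination survives, which is the same outcome the CTE-based delete aims for.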

Select info from row to another row

I have this select on a view:
SELECT
T1.ID AS [ID],
T1.A AS [A],
T1.B AS [B],
T4.C AS [C],
ISNULL(NULLIF(T3.D, ''), T2.D) AS [D],
T1.E AS [E],
T1.F AS [F],
T1.G AS [G]
FROM T1
INNER JOIN T2 ON T2.X = T1.X
INNER JOIN T3 ON T3.Y = T2.Y
RIGHT JOIN T4 ON T4.Z = T3.Z
And get this:
║ ID ║ A ║ B ║ C ║ D ║ E ║ F ║ G ║
╠══════╬══════╬══════╬══════╬══════╬══════╬══════╬══════╬
║ 1 ║ 3 ║ 4 ║ 1000 ║ X ║ 1 ║ 1 ║ 1 ║
║ 1 ║ 3 ║ 4 ║ 2000 ║ Y ║ 1 ║ 1 ║ 1 ║
║ NULL ║ NULL ║ NULL ║ 3000 ║ NULL ║ NULL ║ NULL ║ NULL ║
And I want the last row to look like this:
║ 1 ║ 3 ║ 4 ║ 3000 ║ Z ║ 1 ║ 1 ║ 1 ║
That is, all column values should equal those of the other rows, except for columns 'C' and 'D'. The value of column D is obtained from T2.D.
How can I do this? Thanks.
Without any sample data to work with, I would suggest that lag() may be of use in this exact scenario, such as:
select
IsNull(T1.ID, Lag(T1.ID) over(order by T4.C)) as [ID],
IsNull(T1.A, Lag(T1.A) over(order by T4.C)) as [A],
IsNull(T1.B, Lag(T1.B) over(order by T4.C)) as [B],
T4.C as [C],
IsNull(NullIf(T3.D, ''), T2.D) as [D],
IsNull(T1.E, Lag(T1.E) over(order by T4.C)) as [E],
IsNull(T1.F, Lag(T1.F) over(order by T4.C)) as [F],
IsNull(T1.G, Lag(T1.G) over(order by T4.C)) as [G]
from T1
inner join T2 on T2.X = T1.X
inner join T3 on T3.Y = T2.Y
right join T4 on T4.Z = T3.Z
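A small sketch of the lag() fill, run in SQLite from Python, may make the behaviour clearer. The table `joined` is a hypothetical stand-in for the already-joined result, reduced to three columns, and COALESCE is used in place of SQL Server's IsNull.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the result of the joins in the question.
conn.execute("CREATE TABLE joined (id INTEGER, a INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [(1, 3, 1000), (1, 3, 2000), (None, None, 3000)],
)

# Where a value is NULL, fall back to the value from the previous row
# in order of column c.
filled = conn.execute("""
SELECT COALESCE(id, LAG(id) OVER (ORDER BY c)) AS id,
       COALESCE(a,  LAG(a)  OVER (ORDER BY c)) AS a,
       c
FROM joined
ORDER BY c
""").fetchall()
print(filled)
```

Note that lag() looks back exactly one row, so this only fills a NULL row whose immediate predecessor has values; two unmatched rows in a row would leave the second one NULL.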

sql sorting by subgroup sum data

How can I sort this:
a 1 15
a 2 3
a 3 34
b 1 55
b 2 44
b 3 8
to (by third column sum):
b 1 55
b 2 44
b 3 8
a 1 15
a 2 3
a 3 34
since (55+44+8) > (15+3+34)
If you are using SQL Server, Oracle, or PostgreSQL, you can use a windowed SUM:
SELECT *
FROM tab
ORDER BY SUM(col3) OVER(PARTITION BY col) DESC, col2
Output:
╔═════╦══════╦══════╗
║ col ║ col2 ║ col3 ║
╠═════╬══════╬══════╣
║ b ║ 1 ║ 55 ║
║ b ║ 2 ║ 44 ║
║ b ║ 3 ║ 8 ║
║ a ║ 1 ║ 15 ║
║ a ║ 2 ║ 3 ║
║ a ║ 3 ║ 34 ║
╚═════╩══════╩══════╝
You can do this using ANSI standard window functions. I prefer to use a subquery although this is not strictly necessary:
select col1, col2, col3
from (select t.*, sum(col3) over (partition by col1) as sumcol3
from t
) t
order by sumcol3 desc, col1, col2;
Here is an example of how to do it without window functions, for example in MySQL (it should also work in just about any other standard SQL dialect):
SELECT m.col1, m.col2, m.col3
FROM myTable m
JOIN (
SELECT col1, SUM(col3) groupsum FROM myTable GROUP BY col1
) z ON m.col1 = z.col1
ORDER BY z.groupsum DESC, col2;
Basically, calculate the group sum in a subquery and join/order the results by the group's sum descending.
An SQLfiddle to test with.
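To compare the window-function and join approaches side by side, the following sketch runs both against the sample rows in an in-memory SQLite database. The table name `t`, the alias `group_sum`, and the extra `col1` tie-breaker are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 1, 15), ("a", 2, 3), ("a", 3, 34),
     ("b", 1, 55), ("b", 2, 44), ("b", 3, 8)],
)

# Window version: attach the group total to every row, then sort by it.
windowed = conn.execute("""
SELECT col1, col2, col3
FROM (
    SELECT t.*, SUM(col3) OVER (PARTITION BY col1) AS group_sum
    FROM t
) s
ORDER BY group_sum DESC, col1, col2
""").fetchall()

# Join version: compute the group totals separately and join them back.
joined = conn.execute("""
SELECT m.col1, m.col2, m.col3
FROM t m
JOIN (
    SELECT col1, SUM(col3) AS group_sum
    FROM t
    GROUP BY col1
) z ON m.col1 = z.col1
ORDER BY z.group_sum DESC, m.col1, m.col2
""").fetchall()

print(windowed)
print(windowed == joined)
```

Sorting by `col1` right after the group sum keeps the rows of each group together even when two groups happen to have the same total.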

How to sort a column based on length of data in it in SQL server

As we all know, general sorting is done with ORDER BY. The sort I want to perform is different: I want the shortest values in the middle of the table and the longest ones at the top and bottom. One half should be descending and the other half ascending. Can you help? It was an interview question.
This is one way:
;WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(ORDER BY LEN(YourColumn))
FROM dbo.YourTable
)
SELECT *
FROM CTE
ORDER BY RN%2, (CASE WHEN RN%2 = 0 THEN 1 ELSE -1 END)*RN DESC
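The parity trick above can be checked on a handful of rows. This sketch uses SQLite from Python, with LENGTH in place of SQL Server's LEN; the table `words` and its column `val` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (val TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("A",), ("AB",), ("ABC",), ("ABCD",), ("ABCDE",), ("ABCDEF",)],
)

# Rank rows by length, then list even ranks from longest to shortest,
# followed by odd ranks from shortest to longest.
ordered = [row[0] for row in conn.execute("""
WITH cte AS (
    SELECT val, ROW_NUMBER() OVER (ORDER BY LENGTH(val)) AS rn
    FROM words
)
SELECT val
FROM cte
ORDER BY rn % 2,
         (CASE WHEN rn % 2 = 0 THEN 1 ELSE -1 END) * rn DESC
""")]
print(ordered)
# -> ['ABCDEF', 'ABCD', 'AB', 'A', 'ABC', 'ABCDE']
```

Alternating ranks between the two halves is what puts long values at both ends, with the shortest value in the middle.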
Test Data
DECLARE @Table TABLE
(ID INT, Value VARCHAR(10))
INSERT INTO @Table VALUES
(1 , 'A'),
(2 , 'AB'),
(3 , 'ABC'),
(4 , 'ABCD'),
(5 , 'ABCDE'),
(6 , 'ABCDEF'),
(7 , 'ABCDEFG'),
(8 , 'ABCDEFGI'),
(9 , 'ABCDEFGIJ'),
(10 ,'ABCDEFGIJK')
Query
;WITH CTE AS (
SELECT *
,NTILE(2) OVER (ORDER BY LEN(Value) DESC) rn
FROM @Table )
SELECT *
FROM CTE
ORDER BY CASE WHEN rn = 1 THEN LEN(Value) END DESC
,CASE WHEN rn = 2 THEN LEN(Value) END ASC
Result
╔════╦════════════╦════╗
║ ID ║ Value ║ rn ║
╠════╬════════════╬════╣
║ 10 ║ ABCDEFGIJK ║ 1 ║
║ 9 ║ ABCDEFGIJ ║ 1 ║
║ 8 ║ ABCDEFGI ║ 1 ║
║ 7 ║ ABCDEFG ║ 1 ║
║ 6 ║ ABCDEF ║ 1 ║
║ 1 ║ A ║ 2 ║
║ 2 ║ AB ║ 2 ║
║ 3 ║ ABC ║ 2 ║
║ 4 ║ ABCD ║ 2 ║
║ 5 ║ ABCDE ║ 2 ║
╚════╩════════════╩════╝
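For readers without SQL Server at hand, here is a sketch of the same NTILE(2) ordering in SQLite, run from Python. The names `words`, `val`, and `half` are hypothetical. SQLite, like SQL Server, sorts NULL before other values, so the two CASE keys behave the same way here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (val TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("A",), ("AB",), ("ABC",), ("ABCD",), ("ABCDE",), ("ABCDEF",)],
)

# Split the rows into a longer half (1) and a shorter half (2), then sort
# the first half descending and the second half ascending by length.
lengths = [row[0] for row in conn.execute("""
WITH cte AS (
    SELECT val, NTILE(2) OVER (ORDER BY LENGTH(val) DESC) AS half
    FROM words
)
SELECT LENGTH(val)
FROM cte
ORDER BY CASE WHEN half = 1 THEN LENGTH(val) END DESC,
         CASE WHEN half = 2 THEN LENGTH(val) END ASC
""")]
print(lengths)
# -> [6, 5, 4, 1, 2, 3]
```

Rows in the second half get NULL for the first sort key, which places them after the first half under DESC and leaves the second key to order them.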
Here's a short approach that would get you started:
WITH cte AS
(
SELECT TOP 1000 number
FROM master..spt_values
WHERE type = 'P' and number >0
)
SELECT number, row_number() OVER(ORDER BY CASE WHEN number %2 = 1 THEN number ELSE -(number) END) pos
FROM cte

SQL - Group rows via criteria until exception is found

I am trying to add a Group column to a data set based on some criteria. For a simple example:
╔════╦══════╗
║ ID ║ DATA ║
╠════╬══════╣
║ 1 ║ 12 ║
║ 2 ║ 20 ║
║ 3 ║ 3 ║
║ 4 ║ 55 ║
║ 5 ║ 11 ║
╚════╩══════╝
Let's say our criterion is that the Data should be greater than 10. Then the result should be similar to:
╔════╦══════╦═══════╗
║ ID ║ DATA ║ GROUP ║
╠════╬══════╬═══════╣
║ 1 ║ 12 ║ 1 ║
║ 2 ║ 20 ║ 1 ║
║ 3 ║ 3 ║ 2 ║
║ 4 ║ 55 ║ 3 ║
║ 5 ║ 11 ║ 3 ║
╚════╩══════╩═══════╝
So, all the rows that satisfied the criteria until an exception to the criteria occurred became part of a group. The numbering of the group doesn't necessarily need to follow this pattern, I just felt like this was a logical/simple numbering to explain the solution I am looking for.
You can calculate the group identifier by finding each row where data <= 10. Then, the group identifier is simply the number of rows where that condition is true, up to and including the given row.
select t.*,
(select count(*)
from t t2
where t2.id <= t.id and
t2.data <= 10
) as groupId
from t;
SQL Server 2012 has cumulative sum syntax. The statement would be simpler in that database:
select t.*,
sum(case when t.data <= 10 then 1 else 0 end) over (order by t.id) as groupId
from t;
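As a quick check that the correlated count and the cumulative sum describe the same thing, this sketch runs both forms on the sample rows in SQLite from Python. The outer alias `o` and the column name `group_id` are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, data INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 12), (2, 20), (3, 3), (4, 55), (5, 11)],
)

# Correlated subquery: count exception rows up to and including this row.
correlated = conn.execute("""
SELECT o.id,
       (SELECT COUNT(*)
        FROM t t2
        WHERE t2.id <= o.id AND t2.data <= 10) AS group_id
FROM t AS o
ORDER BY o.id
""").fetchall()

# Window function: running total of the same 0/1 flag.
running = conn.execute("""
SELECT id,
       SUM(CASE WHEN data <= 10 THEN 1 ELSE 0 END) OVER (ORDER BY id) AS group_id
FROM t
ORDER BY id
""").fetchall()

print(correlated)
print(running)
```

Both queries give rows 1 and 2 the identifier 0 and rows 3 to 5 the identifier 1, so the exception row shares a group with the rows that follow it.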
EDIT:
The above does not take into account that rows failing the criterion (data <= 10) belong in their own group; with the logic above, they merely start a new group.
The following assigns a group id with this constraint:
select t.*,
((select 2*count(*)
from t t2
where t2.id < t.id and
t2.data <= 10
) + (case when t.data <= 10 then 1 else 0 end)
) as groupId
from t;
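Here is a sketch of this second formula on the sample rows, with the final CASE testing the row's own data value. It assumes SQLite run from Python; the alias `o` and the column name `group_id` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, data INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 12), (2, 20), (3, 3), (4, 55), (5, 11)],
)

# Twice the number of earlier exception rows, plus 1 if this row is
# itself an exception: exceptions get odd ids, runs between them even ids.
groups = conn.execute("""
SELECT o.id,
       o.data,
       (SELECT 2 * COUNT(*)
        FROM t t2
        WHERE t2.id < o.id AND t2.data <= 10)
       + (CASE WHEN o.data <= 10 THEN 1 ELSE 0 END) AS group_id
FROM t AS o
ORDER BY o.id
""").fetchall()
for row in groups:
    print(row)
```

The identifiers come out as 0, 0, 1, 2, 2, which is the grouping from the question with different numbering.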
This can be done easily with a recursive query:
;WITH CTE
AS (SELECT *,
1 AS [GROUP]
FROM TABLEB
WHERE ID = 1
UNION ALL
SELECT T1.ID,
T1.DATA,
CASE
-- start a new group at an exception row and at the row right after one
WHEN T1.DATA <= 10 OR T2.DATA <= 10 THEN T2.[GROUP] + 1
ELSE T2.[GROUP]
END [GROUP]
FROM TABLEB T1
INNER JOIN CTE T2
ON T1.ID = T2.ID + 1)
SELECT *
FROM CTE
A working example can be found on SQL Fiddle.
Good Luck!
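A sketch of the recursive approach in SQLite, run from Python, is shown below. It assumes ids are consecutive starting at 1, and it starts a new group both at an exception row and at the row right after one, so that it reproduces the GROUP column from the question. The column name `grp` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, data INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 12), (2, 20), (3, 3), (4, 55), (5, 11)],
)

# Walk the rows in id order, carrying the previous row's data and group.
groups = conn.execute("""
WITH RECURSIVE cte(id, data, grp) AS (
    SELECT id, data, 1
    FROM t
    WHERE id = 1
    UNION ALL
    SELECT t.id,
           t.data,
           CASE
               WHEN t.data <= 10 OR cte.data <= 10 THEN cte.grp + 1
               ELSE cte.grp
           END
    FROM t
    JOIN cte ON t.id = cte.id + 1
)
SELECT id, data, grp
FROM cte
ORDER BY id
""").fetchall()
for row in groups:
    print(row)
```

If the ids have gaps, the join on `id + 1` stops early; numbering the rows first with ROW_NUMBER() would be one way around that.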