SQL - Identify Distinct Values Including Count For ALL Columns In A Table - sql

I want to be able to identify the distinct values including a count of the value for each column in a table.
I reviewed - Get distinct records with counts
And it shows me how to do this for an individual column and works great. However, I have a table with over 600 columns, and coding each column would be incredibly time consuming.
Is there a way to code my sql where I could get these same results for all columns in a table, without having to individually input each column?
So to use the example from the link:
personid, msg
-------------
1, 'msg1'
2, 'msg2'
2, 'msg3'
3, 'msg4'
1, 'msg2'
My results would be:
personid, count | msg, count
-----------------------------
1, 2 | msg1, 1
2, 2 | msg2, 2
3, 1 | msg3, 1
_, _ | msg4, 1
Is this possible? I've tried getting at it using distincts and wildcards (*) but no luck.
Apologies if this isn't detailed enough; this is my first post, I'm no SQL expert, and Googling hasn't turned up an answer. Thanks.

I am not sure it is convenient, but you can do it like this:
CREATE TABLE #temp (
personid int,
message nvarchar(max)
);
GO
INSERT INTO #temp
SELECT 1, 'msg1' UNION ALL
SELECT 2, 'msg2' UNION ALL
SELECT 2, 'msg3' UNION ALL
SELECT 3, 'msg4' UNION ALL
SELECT 1, 'msg2';
GO
SELECT
isnull(t1.rn, t2.rn) as rn,
t1.personid as personid, t1.cnt as personid_cnt,
t2.message as message, t2.cnt as message_cnt
FROM
(SELECT personid, count(*) as cnt,
ROW_NUMBER() over (order by personid) as rn
FROM #temp GROUP BY personid) t1
FULL JOIN
(SELECT message, count(*) as cnt,
ROW_NUMBER() over (order by message) as rn
FROM #temp GROUP BY message) t2
ON t1.rn = t2.rn
ORDER BY rn
DROP table #temp;
result:
rn personid personid_cnt message message_cnt
1 1 2 msg1 1
2 2 2 msg2 2
3 3 1 msg3 1
4 NULL NULL msg4 1
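The query above still names each column by hand, which does not scale to 600 columns. One way around that is to read the column names from metadata and generate one GROUP BY per column. Below is a minimal sketch of that idea, using Python's sqlite3 module only so that it runs anywhere. The `messages` table and the `column_value_counts` helper are hypothetical names. On SQL Server the column list would come from INFORMATION_SCHEMA.COLUMNS and the generated statements would run as dynamic SQL.

```python
import sqlite3


def column_value_counts(conn, table):
    """Return {column: [(value, count), ...]} for every column of `table`.

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    # A zero-row SELECT * exposes the column names via cursor.description.
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
    columns = [d[0] for d in cursor.description]

    counts = {}
    for col in columns:
        # One small GROUP BY query per column, generated instead of hand-written.
        sql = (
            f'SELECT "{col}", COUNT(*) FROM "{table}" '
            f'GROUP BY "{col}" ORDER BY "{col}"'
        )
        counts[col] = conn.execute(sql).fetchall()
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (personid INTEGER, msg TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "msg1"), (2, "msg2"), (2, "msg3"), (3, "msg4"), (1, "msg2")],
)

counts = column_value_counts(conn, "messages")
for col, rows in counts.items():
    print(col, rows)
```

The result is one list per column rather than the side-by-side layout from the question, which avoids padding shorter columns with NULLs.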

Related

How to count distinct rows and get data of the row and count of it as a second column

Let's say I have a data
ID
AAA
ABB
ABC
BDS
BRD
CXD
DCU
ETS
I would like to group the rows by their first letter and show, to the right of each letter, how many rows start with it. Sorry, I know my technical language is not very good; I am new to SQL and English is not my first language.
So by script I would like to return
ID Total
A 3
B 2
C 1
D 1
E 1
I have tried
select left(id,1), count(left(id,1) as Total
from Places
group by Id
order by Total desc;
, but it didn't work. Your help will be greatly appreciated.
select left(id,1), count(*) as Total
from Places
group by left(id,1)
order by Total desc;
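The difference from the original attempt is the GROUP BY: grouping by Id puts every distinct ID in its own group, while grouping by the expression gives one group per first letter. Here is a small sketch of that difference, run against SQLite through Python's sqlite3 module. SQLite has no LEFT() function, so `substr(id, 1, 1)` stands in for `left(id, 1)`, and the extra `letter` tie-breaker in the ORDER BY is my addition to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Places (id TEXT)")
conn.executemany(
    "INSERT INTO Places VALUES (?)",
    [(v,) for v in ["AAA", "ABB", "ABC", "BDS", "BRD", "CXD", "DCU", "ETS"]],
)

# Grouping by the full id: every row is its own group, so each count is 1.
by_full_id = conn.execute(
    "SELECT substr(id, 1, 1), COUNT(*) FROM Places GROUP BY id"
).fetchall()

# Grouping by the first-letter expression: one group per letter.
by_letter = conn.execute(
    """
    SELECT substr(id, 1, 1) AS letter, COUNT(*) AS total
    FROM Places
    GROUP BY substr(id, 1, 1)
    ORDER BY total DESC, letter
    """
).fetchall()

print(len(by_full_id), "groups when grouping by id")
print(by_letter)
```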
Is this what you need?
declare @t table(val varchar(10))
insert into @t
select 'AAA' union all
select 'ABB' union all
select 'ABC' union all
select 'BDS' union all
select 'BRD' union all
select 'CXD' union all
select 'DCU' union all
select 'ETS'
select left(t1.val,1) as id ,count(t1.val) as total from @t as t1 left join
(
select distinct right(val,1) as val from @t
) as t2 on t1.val =t2.val
group by left(t1.val,1)
Result is
id total
---- -----------
A 3
B 2
C 1
D 1
E 1

SQL Server Select Get Single Row From Another Table

Needing some help with SQL Server select query here.
I have the following tables defined:
UserSource
UserSourceID ID Name Dept SourceID
1 1 John AAAA 1
2 1 John AAAA 2
3 2 Nena BBBB 1
4 2 Nena BBBB 2
5 3 Gord AAAA 2
6 3 Gord AAAA 1
7 4 Stan CCCC 3
Source
SourceID Description RankOrder
1 FromHR 1
2 FromTemp 2
3 Others 3
Need to join both tables and select only the row where the rank is the smallest. Such that the resulting row would be:
UserSourceID ID Name Dept SourceID Description RankOrder
1 1 John AAAA 1 FromHR 1
3 2 Nena BBBB 1 FromHR 1
6 3 Gord AAAA 1 FromHR 1
7 4 Stan CCCC 3 Others 3
TIA.
Edit:
Here's what I have come up so far, but I seem to be missing something:
WITH
TableA AS(
SELECT 1 AS UserSourceID, 1 AS ID, 'John' AS [Name], 'AAAA' as [Dept], 1 as SourceID
UNION SELECT 2, 1, 'John', 'AAAA', 2
UNION SELECT 3, 2, 'Nena', 'BBBB', 1
UNION SELECT 4, 2, 'Nena', 'BBBB', 2
UNION SELECT 5, 3, 'Gord', 'AAAA', 2
UNION SELECT 6, 3, 'Gord', 'AAAA', 1
UNION SELECT 7, 4, 'Stan', 'DDDD', 3)
,
TableB AS(
SELECT 1 as SourceID, 'FromHR' as [Description], 1 as RankOrder
UNION SELECT 2, 'FromTemp', 2
UNION SELECT 3, 'Others', 3
)
SELECT DISTINCT tblA.*, tblB.SourceID, tblB.Description
FROM TableB tblB
JOIN TableA tblA ON tblA.SourceID = tblB.SourceID
LEFT JOIN TableB b2 ON b2.SourceID = tblB.SourceID
AND B2.RankOrder < tblB.RankOrder
WHERE B2.SourceID IS NULL
UPDATE:
I scanned the tables and there might be some variations of data. I have updated the data for the question as above.
Practically, I need to join these two tables, and be able to only select the row which would have the least RankOrder. In case of record UserSourceID = 7, that particular record would be selected because there's only one row that exists after the tables have been joined.
I use window functions for this type of problem pretty regularly. ROW_NUMBER numbers the rows within each PARTITION BY group, in the ORDER BY sequence you specify in the OVER clause.
select UserSourceID
, ID
, Name
, Dept
, SourceID
, Description
, RankOrder
FROM (SELECT UserSourceID
, ID
, Name
, Dept
, u.SourceID
, Description
, RankOrder
, ROW_NUMBER() over(PARTITION BY ID ORDER BY RankOrder) ranknum
FROM UserSource u
INNER JOIN
Source s
on s.SourceID = u.SourceID ) a
WHERE ranknum = 1
So in this case, the rows for every ID are numbered by RankOrder, and the outer filter keeps only the first row for each ID.
Here's a helpful link to that function from Microsoft. ROW_NUMBER
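To check the ROW_NUMBER approach against the sample data from the question, here is a small sketch that runs the same query shape in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table contents are copied from the question; the final ORDER BY is my addition so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserSource "
    "(UserSourceID INTEGER, ID INTEGER, Name TEXT, Dept TEXT, SourceID INTEGER)"
)
conn.execute(
    "CREATE TABLE Source (SourceID INTEGER, Description TEXT, RankOrder INTEGER)"
)
conn.executemany(
    "INSERT INTO UserSource VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "John", "AAAA", 1),
        (2, 1, "John", "AAAA", 2),
        (3, 2, "Nena", "BBBB", 1),
        (4, 2, "Nena", "BBBB", 2),
        (5, 3, "Gord", "AAAA", 2),
        (6, 3, "Gord", "AAAA", 1),
        (7, 4, "Stan", "CCCC", 3),
    ],
)
conn.executemany(
    "INSERT INTO Source VALUES (?, ?, ?)",
    [(1, "FromHR", 1), (2, "FromTemp", 2), (3, "Others", 3)],
)

rows = conn.execute(
    """
    SELECT UserSourceID, ID, Name, Dept, SourceID, Description, RankOrder
    FROM (
        SELECT u.UserSourceID, u.ID, u.Name, u.Dept, u.SourceID,
               s.Description, s.RankOrder,
               -- restart the numbering for every ID, lowest RankOrder first
               ROW_NUMBER() OVER (PARTITION BY u.ID ORDER BY s.RankOrder) AS ranknum
        FROM UserSource u
        INNER JOIN Source s ON s.SourceID = u.SourceID
    ) a
    WHERE ranknum = 1
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```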
----UPDATE----
Here's with Rank and Row Number as options.
select UserSourceID
, ID
, Name
, Dept
, SourceID
, Description
, RankOrder
FROM (SELECT UserSourceID
, ID
, Name
, Dept
, u.SourceID
, Description
, RankOrder
, ROW_NUMBER() over(PARTITION BY ID ORDER BY RankOrder) row_num
, RANK() over(PARTITION BY ID ORDER BY RankOrder) rank_num --use this if you want to see the duplicate records
FROM UserSource u
INNER JOIN
Source s
on s.SourceID = u.SourceID ) a
WHERE row_num = 1 --rank_num = 1
Replace row_num with rank_num to view any items with duplicate RankOrder entries
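The difference between the two functions only shows up when an ID has two rows with the same lowest RankOrder. The sketch below uses a small made-up table (`joined`, with a deliberate tie for ID 1) and SQLite through Python's sqlite3 module to show it: filtering on ROW_NUMBER keeps exactly one row per ID, while filtering on RANK keeps every row tied for first place. The UserSourceID tie-breaker inside ROW_NUMBER is my addition, so that the row it picks is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-joined data: ID 1 has two rows tied at RankOrder 1.
conn.execute(
    "CREATE TABLE joined (UserSourceID INTEGER, ID INTEGER, RankOrder INTEGER)"
)
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 2, 3)],
)

SQL = """
SELECT UserSourceID
FROM (
    SELECT UserSourceID,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RankOrder, UserSourceID) AS row_num,
           RANK()       OVER (PARTITION BY ID ORDER BY RankOrder)               AS rank_num
    FROM joined
) a
WHERE {column} = 1
ORDER BY UserSourceID
"""

# ROW_NUMBER: one row per ID, ties broken by the ORDER BY.
first_only = [r[0] for r in conn.execute(SQL.format(column="row_num"))]
# RANK: every row that shares the lowest RankOrder.
with_ties = [r[0] for r in conn.execute(SQL.format(column="rank_num"))]

print(first_only)
print(with_ties)
```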

Merging data in a single SQL table without a Cursor

I have a table with an ID column and another column with a number. One ID can have multiple numbers. For example
ID | Number
1 | 25
1 | 26
1 | 30
1 | 24
2 | 4
2 | 8
2 | 5
Now based on this data, in a new table, I want to have this
ID | Low | High
1 | 24 | 26
1 | 30 | 30
2 | 4 | 5
2 | 8 | 8
As you can see, I want to merge any data where the numbers are consecutive, like 24, 25, 26. So now the low was 24, the high was 26, and then 30 is still a separate range. I am dealing with large amounts of data, so I would prefer to not use a cursor for performance sake (which is what I was previously doing, and was slowing things down quite a bit)...What is the best way to achieve this? I'm no SQL pro, so I'm not sure if there is a function available that could make this easier, or what the fastest way to accomplish this would be.
Thanks for the help.
The key observation is that, within a run of consecutive numbers, the number minus its position in the sorted sequence is a constant. We can generate that position using row_number. This identifies all the groups:
select id, MIN(number) as low, MAX(number) as high
from (select t.*,
(number - ROW_NUMBER() over (partition by id order by number) ) as groupnum
from t
) t
group by id, groupnum
The rest is just aggregation.
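To make the constant difference visible, here is a small sketch that runs the same query in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later) on the sample data from the question. It prints the intermediate groupnum values first and then the aggregated ranges. It assumes there are no duplicate (id, number) pairs; the ORDER BY clauses are my additions so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 25), (1, 26), (1, 30), (1, 24), (2, 4), (2, 8), (2, 5)],
)

# Step 1: number minus its row number is constant inside each consecutive run.
groups = conn.execute(
    """
    SELECT id, number,
           number - ROW_NUMBER() OVER (PARTITION BY id ORDER BY number) AS groupnum
    FROM t
    ORDER BY id, number
    """
).fetchall()

# Step 2: aggregate per (id, groupnum) to get the low and high of each run.
ranges = conn.execute(
    """
    SELECT id, MIN(number) AS low, MAX(number) AS high
    FROM (
        SELECT id, number,
               number - ROW_NUMBER() OVER (PARTITION BY id ORDER BY number) AS groupnum
        FROM t
    ) AS g
    GROUP BY id, groupnum
    ORDER BY id, low
    """
).fetchall()

print(groups)
print(ranges)
```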
Solution with CTE and recursion:
WITH CTE AS (
SELECT T.ID, T.NUMBER, T.NUMBER AS GRP
FROM T
LEFT OUTER JOIN T T2 ON T.ID = T2.ID AND T.NUMBER -1 = T2.NUMBER
WHERE T2.ID IS NULL
UNION ALL
SELECT T.ID, T.NUMBER, GRP
FROM CTE
INNER JOIN T
ON T.ID = CTE.ID AND T.NUMBER = CTE.NUMBER + 1
)
SELECT ID, MIN(NUMBER) AS Low, MAX(NUMBER) AS High
FROM CTE
GROUP BY ID, GRP
Results at fiddlesql
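The recursive approach can be tried locally as well. Here is a sketch of the same idea in SQLite through Python's sqlite3 module: the anchor part picks every number that has no predecessor (the start of a run), and the recursive part walks forward one number at a time, carrying the start value along as the group key. SQLite spells it WITH RECURSIVE, and the aliases `prev`, `low` and `high` are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 25), (1, 26), (1, 30), (1, 24), (2, 4), (2, 8), (2, 5)],
)

ranges = conn.execute(
    """
    WITH RECURSIVE cte(id, number, grp) AS (
        -- anchor: numbers with no (number - 1) row for the same id start a run
        SELECT t.id, t.number, t.number
        FROM t
        LEFT JOIN t AS prev
               ON prev.id = t.id AND prev.number = t.number - 1
        WHERE prev.id IS NULL
        UNION ALL
        -- recursion: extend each run by the next consecutive number
        SELECT t.id, t.number, cte.grp
        FROM cte
        JOIN t ON t.id = cte.id AND t.number = cte.number + 1
    )
    SELECT id, MIN(number) AS low, MAX(number) AS high
    FROM cte
    GROUP BY id, grp
    ORDER BY id, grp
    """
).fetchall()

print(ranges)
```

Each recursion step advances a run by only one row, so for long runs the ROW_NUMBER version is likely to be the cheaper of the two.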
I'd suggest using a WHILE loop structure with a table variable instead of the cursor.
For example,
DECLARE @TableVariable TABLE
(
MyID int IDENTITY (1, 1) PRIMARY KEY NOT NULL,
[ID] int,
[Number] int
)
DECLARE @Count int, @Max int
INSERT INTO @TableVariable (ID, Number)
SELECT ID, Number
FROM YourSourceTable
SELECT @Count = 1, @Max = MAX(MyID)
FROM @TableVariable
WHILE @Count <= @Max
BEGIN
-- ...do your processing here...
SET @Count = @Count + 1
END
CREATE TABLE Table1
([ID] int, [Number] int)
;
INSERT INTO Table1
([ID], [Number])
VALUES
(1, 25),
(1, 26),
(1, 30),
(1, 24),
(2, 4),
(2, 8),
(2, 5)
;
select ID,
MIN(Number)
,(SELECT MIN(Number)
FROM (SELECT TOP 2 Number from Table1 WHERE ID =
T1.Id ORDER BY Number DESC) as DT)
from Table1 as T1
GROUP BY ID
UNION
SELECT ID, MAX(Number), MAX(Number)
FROM Table1 as T1
GROUP BY ID;
Live Example

In Oracle, how do I get a page of distinct values from sorted results?

I have 2 columns in a one-to-many relationship. I want to sort on the "many" and return the first occurrence of the "one". I need to page through the data so, for example, I need to be able to get the 3rd group of 10 unique "one" values.
I have a query like this:
SELECT id, name
FROM table1
INNER JOIN table2 ON table2.fkid = table1.id
ORDER BY name, id;
There can be multiple rows in table2 for each row in table1.
The results of my query look like this:
id | name
----------------
2 | apple
23 | banana
77 | cranberry
23 | dark chocolate
8 | egg
2 | yak
19 | zebra
I need to page through the result set with each page containing n unique ids. For example, if start=1 and n=4 I want to get back
2
23
77
8
in the order they were sorted on (i.e., name), where id is returned in the position of its first occurrence. Likewise if start=3 and n=4 and order = desc I want
8
23
77
2
I tried this:
SELECT * FROM (
SELECT id, ROWNUM rnum FROM (
SELECT DISTINCT id FROM (
SELECT id, name
FROM table1
INNER JOIN table2 ON table2.fkid = table1.id
ORDER BY name, id)
WHERE ROWNUM <= 4)
WHERE rnum >=1)
which gave me the ids in numerical order, instead of being ordered as the names would be.
I also tried:
SELECT * FROM (
SELECT DISTINCT id, ROWNUM rnum FROM (
SELECT id FROM (
SELECT id, name
FROM table1
INNER JOIN table2 ON table2.fkid = table1.id
ORDER BY name, id)
WHERE ROWNUM <= 4)
WHERE rnum >=1)
but that gave me duplicate values.
How can I page through the results of this data? I just need the ids, nothing from the "many" table.
update
I suppose I'm getting closer with changing my inner query to
SELECT id, name, rank() over (order by name, id)
FROM table1
INNER JOIN table2 ON table2.fkid = table1.id
...but I'm still getting duplicate ids.
You may need to debug it a little, but it will be something like this:
SELECT id FROM (
SELECT id, ROWNUM rnum FROM (
SELECT id, name FROM (
SELECT id, name, row_number() over (partition by id order by name) rn
FROM table1
INNER JOIN table2 ON table2.fkid = table1.id
) WHERE rn = 1 ORDER BY name, id
) WHERE ROWNUM <= 4 -- last row of the page (start + n - 1)
) WHERE rnum >= 1; -- first row of the page (start)
It's a bit convoluted (and I suspect it could be simplified), but it should work. You can put whatever start and end positions you want in the WHERE clause. Here start=2 and n=4 are pulled from a separate subquery (x), but you could simplify things by using a couple of parameters instead.
SQL> ed
Wrote file afiedt.buf
with t as (
  select 2 id, 'apple' name from dual union all
  select 23, 'banana' from dual union all
  select 77, 'cranberry' from dual union all
  select 23, 'dark chocolate' from dual union all
  select 8, 'egg' from dual union all
  select 2, 'yak' from dual union all
  select 19, 'zebra' from dual
),
x as (
  select 2 start_pos, 4 n from dual
)
select *
from (
  select distinct
         id,
         dense_rank() over (order by min_id_rnk) outer_rnk
  from (
    select id,
           min(rnk) over (partition by id) min_id_rnk
    from (
      select id,
             name,
             rank() over (order by name) rnk
      from t
    )
  )
)
where outer_rnk between (select start_pos from x) and (select start_pos+n-1 from x)
order by outer_rnk
SQL> /
ID OUTER_RNK
---------- ----------
23 2
77 3
8 4
19 5
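For ascending order, the first occurrence of an id is simply the row with its smallest name, so the ranking can be reduced to a GROUP BY with MIN(name) followed by ordinary paging. The sketch below tries that simplification in SQLite through Python's sqlite3 module. The single table `t` stands in for the table1/table2 join and `page_of_ids` is a hypothetical helper. LIMIT/OFFSET is SQLite syntax; on Oracle 12c and later the equivalent would be OFFSET ... ROWS FETCH NEXT ... ROWS ONLY, and older versions would need the ROWNUM wrapping shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (2, "apple"), (23, "banana"), (77, "cranberry"),
        (23, "dark chocolate"), (8, "egg"), (2, "yak"), (19, "zebra"),
    ],
)


def page_of_ids(conn, start, n):
    """Return n distinct ids, beginning at 1-based position `start`,
    ordered by the first (smallest) name each id appears with."""
    sql = """
        SELECT id
        FROM (SELECT id, MIN(name) AS first_name FROM t GROUP BY id)
        ORDER BY first_name, id
        LIMIT ? OFFSET ?
    """
    return [row[0] for row in conn.execute(sql, (n, start - 1))]


print(page_of_ids(conn, 1, 4))
print(page_of_ids(conn, 2, 4))
```

For descending order, MAX(name) with a descending sort would take the place of MIN(name).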

tsql - Setting sequential values without looping/cursoring

I need to set a non-unique identifier in a data table. This would be sequential within a group, i.e. for each group, the ID should start at 1 and rise in increments of 1 until the last row for that group.
This is illustrated by the table below. "New ID" is the column I need to populate.
Unique ID Group ID New ID
--------- -------- ------
1 1123 1
2 1123 2
3 1124 1
4 1125 1
5 1125 2
6 1125 3
7 1125 4
Is there any way of doing this without looping/cursoring? If looping/cursoring is the only way, what would the most efficient code be?
Thanks
One method is to use ROW_NUMBER() OVER(PARTITION BY ... ORDER BY ...) in an UPDATE...FROM statement with a subquery in the FROM clause.
update MyTable set NewID = B.NewID
from
MyTable as A
inner join (select UniqueID, ROW_NUMBER() over (partition by GroupID order by UniqueID) as NewID from MyTable) as B on B.UniqueID = A.UniqueID
MSDN has a good sample to get you started:
You need to use a subquery in the FROM clause in order to use a window function (ROW_NUMBER())
Partition By tells the server when to reset the row numbers
Order By tells the server which way to order the group's NewID's
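As a quick check of the numbering logic, here is a sketch that applies the same ROW_NUMBER() subquery to the sample data in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). Instead of T-SQL's UPDATE ... FROM with a join, it uses a correlated subquery in the SET clause, which is a swapped-in form chosen for portability; the numbering itself is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (UniqueID INTEGER PRIMARY KEY, GroupID INTEGER, NewID INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable (UniqueID, GroupID) VALUES (?, ?)",
    [(1, 1123), (2, 1123), (3, 1124), (4, 1125), (5, 1125), (6, 1125), (7, 1125)],
)

conn.execute(
    """
    UPDATE MyTable
    SET NewID = (
        SELECT b.rn
        FROM (
            -- numbering restarts at 1 for each GroupID, ordered by UniqueID
            SELECT UniqueID,
                   ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY UniqueID) AS rn
            FROM MyTable
        ) AS b
        WHERE b.UniqueID = MyTable.UniqueID
    )
    """
)

rows = conn.execute(
    "SELECT UniqueID, GroupID, NewID FROM MyTable ORDER BY UniqueID"
).fetchall()
for row in rows:
    print(row)
```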
I agree with Damien's point in the comments, but you don't need a JOIN; you can just update the CTE directly.
;WITH cte AS
(
SELECT [New ID],
ROW_NUMBER() OVER (PARTITION BY [Group ID] ORDER BY [Unique ID]) AS _NewID
FROM #T
)
UPDATE cte
SET [New ID] = _NewID
Online Demo
An alternative to ROW_NUMBER() if you're on SQL Server 2000:
SELECT UniqueID,
GroupID,
(SELECT COUNT(*)
FROM myTable T2
WHERE T2.GroupID = T1.GroupID
AND T2.UniqueID <= T1.UniqueID) AS NewID
FROM myTable T1
This solution will also work, if you are running an old version of mssql
--Test table:
DECLARE @t table(Unique_ID int, Group_ID int, New_ID int)
--Test data:
INSERT @t (unique_id, group_id)
SELECT 1, 1123 UNION ALL SELECT 2, 1123 UNION ALL SELECT 3, 1124 UNION ALL SELECT 4, 1125 UNION ALL SELECT 5, 1125 UNION ALL SELECT 6, 1125 UNION ALL SELECT 7, 1125
--Syntax:
UPDATE t
SET new_id =
(SELECT count(*)
FROM @t
WHERE t.unique_id >= unique_id and t.group_id = group_id
GROUP BY group_id)
FROM @t t
--Result:
SELECT * FROM @t
Unique_ID Group_ID New_ID
----------- ----------- -----------
1 1123 1
2 1123 2
3 1124 1
4 1125 1
5 1125 2
6 1125 3
7 1125 4
SELECT
UniqueId,
GroupID,
ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY UniqueId) AS NewIdx
FROM
....