sql GROUP within group - sql

Let's say I have this database:
ID | Name | City
1 | John | TLV
2 | Abe | JLM
3 | John | JLM
I want to know how many people with different names are in each city.
I tried to use GROUP BY like this:
SELECT `city`, count(`index`) as `num` FROM `people`
GROUP BY `city`, `name`
But this seems to group by both.
City | num
TLV | 1
JLM | 1
What I want to do is to group by city, and group the results by name.
City | num
TLV | 1
JLM | 2
How can I do this?

I think you want this:
SELECT `city`, count(distinct name) as `num`
FROM `people`
GROUP BY `city`;
You might want just count(name) . . . I'm not sure what you mean by "differently named". count(name) is preferable, if you don't need the distinct.

Related

Get related columns in group by SQL query

I have a database with data that relate to countries, simplified it looks something like this:
ID | country_id | country_ISO | var_value
1 | 1 | FR | 10
2 | 2 | BE | 15
3 | 3 | NL | 20
4 | 1 | FR | 6
5 | 2 | BE | 8
6 | 2 | BE | 12
I would like to get the sum of the values of "var_value", but with that I want to have the country_id as well country_ISO.
I can do this:
SELECT
country_ISO,
SUM(var_value) AS sum_of_value
FROM
table_name
GROUP BY
country_ISO;
This query will give me the sum of var_value and the country ISO, but I also want to get the country_id. How do I subquery/self join to get extra columns that are related (in a unique way) to for example country_ISO?
Since country_iso depends on country_id anyway, just extend the GROUP BY by country_id.
SELECT country_id,
country_iso,
sum(var_value) sum_of_value
FROM table_name
GROUP BY country_id,
country_iso;
Just include that column also in GROUP BY clause :
SELECT country_id, country_ISO, SUM(var_value) AS sum_of_value
FROM table_name tn
GROUP BY country_id, country_ISO;
If contry_id, contry_ISO matched 1:1, then just group by with contry_id, contry_ISO.
SELECT
contry_id,
country_ISO,
SUM(var_value) AS sum_of_value
FROM
table_name
GROUP BY
contry_id,
country_ISO;

Identifying duplicate records in SQL along with primary key

I have a business case scenario where I need to do a lookup into our SQL "Users" table to find out email addresses which are duplicated. I was able to do that by the below query:
SELECT
user_email, COUNT(*) as DuplicateEmails
FROM
Users
GROUP BY
user_email
HAVING
COUNT(*) > 1
ORDER BY
DuplicateEmails DESC
I get an output like this:
user_email DuplicateEmails
--------------------------------
abc#gmail.com 2
xyz#yahoo.com 3
Now I am asked to list out all the duplicate records in a single row of its own and display some additional properties like first name , last name and userID. All this information is stored in this table "Users". I am having difficulty doing so. Can anyone help me or put me toward right direction?
My output needs to look like this:
user_email DuplicateEmails FirstName LastName UserID
------------------------------------------------------------------------------
abc#gmail.com 2 Tim Lentil timLentil
abc#gmail.com 2 John Doe johnDoe12
xyz#yahoo.com 3 brian boss brianTheBoss
xyz#yahoo.com 3 Thomas Hood tHood
xyz#yahoo.com 3 Mark Brown MBrown12
There are several ways you could do this. Here is one using a cte.
with FoundDuplicates as
(
SELECT
uter_email, COUNT(*) as DuplicateEmails
FROM
Users
GROUP BY
uter_email
HAVING
COUNT(*) > 1
)
select fd.user_email
, fd.DuplicateEmails
, u.FirstName
, u.LastName
, u.UserID
from Users u
join FoundDuplicates fd on fd.uter_email = u.uter_email
ORDER BY fd.DuplicateEmails DESC
Use count() over( Partition by ), example
You can solve it like:
DECLARE #T TABLE
(
UserID VARCHAR(20),
FirstName NVARCHAR(45),
LastName NVARCHAR(45),
UserMail VARCHAR(45)
);
INSERT INTO #T (UserMail, FirstName, LastName, UserID) VALUES
('abc#gmail.com', 'Tim', 'Lentil', 'timLentil'),
('abc#gmail.com', 'John', 'Doe', 'johnDoe12'),
('xyz#yahoo.com', 'brian', 'boss', 'brianTheBoss'),
('xyz#yahoo.com', 'Thomas', 'Hood', 'tHood'),
('xyz#yahoo.com', 'Mark', 'Brown', 'MBrown12');
SELECT *, COUNT (1) OVER (PARTITION BY UserMail) MailCount
FROM #T;
Results:
+--------------+-----------+----------+---------------+-----------+
| UserID | FirstName | LastName | UserMail | MailCount |
+--------------+-----------+----------+---------------+-----------+
| timLentil | Tim | Lentil | abc#gmail.com | 2 |
| johnDoe12 | John | Doe | abc#gmail.com | 2 |
| brianTheBoss | brian | boss | xyz#yahoo.com | 3 |
| tHood | Thomas | Hood | xyz#yahoo.com | 3 |
| MBrown12 | Mark | Brown | xyz#yahoo.com | 3 |
+--------------+-----------+----------+---------------+-----------+
Use a window function like this:
SELECT u.*
FROM (SELECT u.*, COUNT(*) OVER (PARTITION BY user_email) as numDuplicateEmails
FROM Users
) u
WHERE numDuplicateEmails > 1
ORDER BY numDuplicateEmails DESC;
I think this will also work.
WITH cte (
SELECT
*
,DuplicateEmails = ROW_NUMBER() OVER (Partition BY user_email ORder by user_email)
FROM Users
)
Select * from CTE
where DuplicateEmails > 1

Sum Decode statement SQL

I am trying to sum a few Decode statements and column names, but am having difficulties.
currently it is showing as
rank | name | points
----------------------
0 | john | 0
0 | john | 40
1 | john | 30
2 | tom | 22
0 | tom | 0
I expect to have this result:
rank | name | points
----------------------
1 | john | 70
2 | tom | 22
Query:
Select Rank, Name, Code, Points
From
(select
decode(Table.name, 'condition1', Table.value) As Points,
decode(Table.name, 'Condition2', Table.value) As Rank,
Employee.name as Name,
Employee.GA1 as Code
from Table
inner Join Employee
on Empolyee.positionseq = name.positionseq
where Table.name IN ('Condition1', 'Condition2')
);
Select MAX(Rank), Name, Code, SUM(Points)
From
(select
decode(Table.name, 'condition1', Table.value) As Points
decode(Table.name, 'Condition2', Table.value) As Rank
,Employee.name as Name
,Employee.GA1 as Code
from Table
inner Join Employee
on Employee.positionseq = name.positionseq
where Table.name IN( 'Condition1', 'Condition2'))
GROUP BY Employee.id;
I added the SUM, MAX (for rank) and GROUP BY statements. Also corrected some misspellings (Empolyee)
I may be understanding your question incorrectly, however, it seems like you are trying to do the following (omitting inner join for simplicity):
Select MAX(rank), name, SUM(points)
FROM UserRanks
GROUP BY name
Based on your data set above, you should get the following results:
rank name points
1 john 70
2 tom 22

How to select all attributes (*) with distinct values in a particular column(s)?

Here is link to the w3school database for learners:
W3School Database
If we execute the following query:
SELECT DISTINCT city FROM Customers
it returns us a list of different City attributes from the table.
What to do if we want to get all the rows like that we get from SELECT * FROM Customers query, with unique value for City attribute in each row.
DISTINCT when used with multiple columns, is applied for all the columns together. So, the set of values of all columns is considered and not just one column.
If you want to have distinct values, then concatenate all the columns, which will make it distinct.
Or, you could group the rows using GROUP BY.
You need to select all values from customers table, where city is unique. So, logically, I came with such query:
SELECT * FROM `customers` WHERE `city` in (SELECT DISTINCT `city` FROM `customers`)
I think you want something like this:
(change PK field to your Customers Table primary key or index like Id)
In SQL Server (and standard SQL)
SELECT
*
FROM (
SELECT
*, ROW_NUMBER() OVER (PARTITION BY City ORDER BY PK) rn
FROM
Customers ) Dt
WHERE
(rn = 1)
In MySQL
SELECT
*
FORM (
SELECT
a.City, a.PK, count(*) as rn
FROM
Customers a
JOIN
Customers b ON a.City = b.City AND a.PK >= b.PK
GROUP BY a.City, a.PK ) As DT
WHERE (rn = 1)
This query -I hope - will return your Cities distinctly and also shows other columns.
You can use GROUP BY clause for getting distinct values in a particular column. Consider the following table - 'contact':
+---------+------+---------+
| id | name | city |
+---------+------+---------+
| 1 | ABC | Chennai |
+---------+------+---------+
| 2 | PQR | Chennai |
+---------+------+---------+
| 3 | XYZ | Mumbai |
+---------+------+---------+
To select all columns with distinct values in City attribute, use the following query:
SELECT *
FROM contact
GROUP BY city;
This will give you the output as follows:
+---------+------+---------+
| id | name | city |
+---------+------+---------+
| 1 | ABC | Chennai |
+---------+------+---------+
| 3 | XYZ | Mumbai |
+---------+------+---------+

How to get the previous row-text

First of all: I am a SQL beginner and I use SQL Server 2008.
The tables as it is now, is written as:
SELECT
Transaction.description, Person.name
FROM
Transaction, Person, SystemUser
WHERE
Person.personnumber = SystemUser.personnumber
AND Transaction.art_ID = SystemUser.art_ID
ORDER BY
Transaction.description
where personnumber is PK nvarchar (could look like N0890) where the last numbers of it grows with +1 for every new person.
art_ID (Transaction) is PK smallint, art_ID (SystemUser) is smallint, description is nvarchar.
I want to get the text from the previous row, in the same column, so that I can manipulate the text to be clear and make the result-table look more simple.
Example as it is now:
|Transactions | Persons |
|-------------------|----------|
|Statistic | Ursula |
|Statistic | Peter |
|Statistic | Alan |
|Settlement | Christie |
|Settlement | Tania |
|Deptor department | Jack |
|Economy department | Rickie |
|Economy department | Annie |
|Economy department | Tom |
|Economy department | Seth |
How I want it to be:
|Transactions | Persons |
|-------------------|----------|
|Statistic | Ursula |
| | Peter |
| | Alan |
|Settlement | Christie |
| | Tania |
|Deptor department | Jack |
|Economy department | Rickie |
| | Annie |
| | Tom |
| | Seth |
as in select case when description = description - 1 row then ''
I have searched for examples and every one of them are based on integers, not varchar/nvarchar), and I keep getting errors when i try to do it with varchars. Such as With CTE, min() and max().
Do you have any ideas of what function I can use or how to set up the select-statement to do as I want?
First use a rank function to identify just one of them:
SELECT Transaction.description, Person.name,
RANK() OVER (PARTITION BY Transaction.description ORDER BY Person.name) As R
FROM Transaction, Person, SystemUser
WHERE Person.personnumber = SystemUser.personnumber
AND Transaction.art_ID = SystemUser.art_ID
ORDER BY Transaction.description, Person.name
Notice the lines you want to see have 1 against them? Use that:
SELECT
CASE WHEN R=1 THEN Transaction.description ELSE '' END description,
Person.name
FROM
(
SELECT Transaction.description, Person.name,
RANK() OVER (PARTITION BY Transaction.description ORDER BY Person.name) As R
FROM Transaction, Person, SystemUser
WHERE Person.personnumber = SystemUser.personnumber
AND Transaction.art_ID = SystemUser.art_ID
) Subtable
ORDER BY Transaction.description, Person.name
I think following SQL should work
CREATE TABLE #TempTable (rowrank INT, description VARCHAR(256), name VARCHAR(256));
INSERT INTO #TempTable (rowrank, description, name)
VALUES
Select RANK() OVER (ORDER BY Transaction.description)
,Transaction.description
,name
FROM Transaction, Person, SystemUser
WHERE Person.personnumber = SystemUser.personnumber
AND Transaction.art_ID = SystemUser.art_ID
ORDER BY Transaction.description
SELECT
CASE
WHEN prev.RANK = TT.RANK
THEN ""
ELSE TT.Description
END AS Description,
name
FROM #TempTable TT
LEFT JOIN #TempTable prev ON prev.rownum = TT.rownum - 1