Multiple group by row extracting in SQL - sql

let me tell you my sample data structure first.
CREATE TABLE foobar (
id int primary key,
lastname varchar(100) not null,
firstname varchar(100) not null,
val1 int not null,
val2 int not null
)
imagine sample entries like that:
1, Smith, Bob, 1, 3
2, Smith, Bob 2, 1
3, SMith, Allen , 3, 4
i.e. Firstname, Lastname are not distinct here
Now see the query:
select lastname, firstname, avg(val1), avg(val2)
from foobar
where lastname = 'Smith'
group by firstname
sample output:
1, Smith, Bob, 1.5 , 2
2, Smith, Allen , 3, 4
Now i want three different things.
1. get me all smith where avg(val1) is the biggest value of all
2. get me all smith where avg(val1) is the lowest value of all
sample:
1. 1.5 is smallest value therefore get me Bob
2. 3 is the biggest value therefore get me Allen
I don't know how to do this efficiently.
My approach was to save the min and max of avg() and then join into the same table on that value. But i fell like this is a bad solution.
Is there some efficient way?

I'm not sure what you mean biggest value of all. By definition, an average will never be the largest value, but instead, the median of all?
If you mean to just return the largest and smallest entries in the table, use something like this:
select lastname, firstname, max(val1), val2
from foobar
where lastname = 'Smith'
group by firstname
select lastname, firstname, min(val1), val2
from foobar
where lastname = 'Smith'
group by firstname

Related

Merging Duplicate Rows with SQL

I have a table that contains usernames, these names are duplicated in various forms, for example, Mr. John is replicated as John Mr. I want to combine the two rows using their unique phone numbers in SQL.
I want a new table in this form after removing the duplicates
you can do it with ROW_NUMBER window function.
First, you need to group the data by your unique column (Phone_Number), then sort by name.
Preparing the table and example data:
DECLARE #vCustomers TABLE (
Name NVARCHAR(25),
Phone_Number NVARCHAR(9),
Address NVARCHAR(25)
)
INSERT INTO #vCustomers
VALUES
('Mr John', '234881675', 'Lagos'),
('Mr Felix', '234867467', 'Atlanta'),
('Mrs Ayo', '234786959', 'Doha'),
('John Mr', '234881675', 'Lagos'),
('Mr Jude', '235689760', 'Rabat'),
('Ayo', '234786959', 'Doha'),
('Jude', '235689760', 'Rabat')
After that, removing the duplicate rows:
DELETE
vc
FROM (
SELECT
ROW_NUMBER() OVER(PARTITION BY Phone_Number ORDER BY Name DESC) AS RN
FROM #vCustomers
) AS vc
WHERE RN > 1
SELECT * FROM #vCustomers
As final, the result:
Name
Phone_Number
Address
Mr John
234881675
Lagos
Mr Felix
234867467
Atlanta
Mrs Ayo
234786959
Doha
Mr Jude
235689760
Rabat

Trying to show Count of Person, Spouse, Kids

I am doing
select
count(distinct(ssn))
, count (distinct(ssn + spousename))
, count distinct((ssn + spousename + kidname))
My problem is that if the spouse name or kid name are blank there is no spouse or kid so they should not be counted.
How would you bypass a blank value from the count?
data:
SSN Person Spouse Child
111-11.... John (no spouse no kid)
222-22.....Jane Jim Jack
333-33.....Jerry Jack (no spouse)
333-33.....Jerry Jill (no spouse second kid)
444-44..... John Judy (no kid)
My answer should be 4 people, 2 spouses and 3 kids because I am doing a count of unique values that don't include blank.
I can't show the real SSN and names so It looks like fake data
Thank you!
you may try COALESCE
select
count(distinct(ssn))
, count (distinct(ssn + spousename))
, count distinct((ssn + spousename + coalesce(kidname,'')))
Try this:
SELECT
COUNT(ssn) AS people,
SUM(spouses) AS spouses,
SUM(children) AS children
FROM
(
SELECT
ssn,
MAX(person) AS person,
COUNT(spouse) AS spouses,
COUNT(child) AS children
FROM
TableName
GROUP BY
ssn
) BySSN
;
Providing your data in an accurate (and of course made up!), and consumable (DDL+DML) format is critical to getting the correct answer.
Knowing whether you use null or blank is makes a big difference also.
I am a big fan of counting exactly what it is you want to count, rather than relying on distinct.
declare #MyData table (SSN varchar(6), Person varchar(12), Spouse varchar(12), Child varchar(12));
insert into #MyData (SSN, Person, Spouse, Child)
values
('111-11', 'John', null, null),
('222-22', 'Jane', 'Jim', 'Jack'),
('333-33', 'Jerry', null, 'Jack'),
('333-33', 'Jerry', null, 'Jill'),
('444-44', 'John', 'Judy', null);
with cte as (
select count(Spouse) Spouse, count(Child) Child
from #MyData
group by SSN
)
select count(*) [Num People], sum(Spouse) Spouses], sum(Child) [Num Children]
from cte;
Returns:
Num People
Num Spouses
Num Children
4
2
3

How can I use SQL Pivot Table to turn my rows of data into columns

In SQL 2008, I need to flatten a table and show extra rows as columns. All I can find are queries with calculations. I just want to show the raw data. The data is like as below (simplified):
ID# Name Name_Type
1 Mary Jane Legal
1 MJ Nickname
1 Smith Maiden
2 John Legal
3 Suzanne Legal
3 Susie Nickname
I want the data to show as:
ID# Legal Nickname Maiden
1 Mary Jane MJ Smith
2 John
3 Suzanne Susie
where nothing shows in the column if there is not a row existing for that column. I'm thinking the Pivot Table method should work.
PIVOT requires you to use an aggregate. See this post for a better explanation of how it works.
CREATE TABLE #MyTable
(
ID# INT
, Name VARCHAR(50)
, Name_Type VARCHAR(50)
);
INSERT INTO #MyTable VALUES
(1, 'Mary Jane', 'Legal')
, (1, 'MJ', 'Nickname')
, (1, 'Smith', 'Maiden')
, (2, 'John', 'Legal')
, (3, 'Suzanne', 'Legal')
, (3, 'Susie', 'Nickname');
SELECT *
FROM
(
SELECT * FROM #MyTable
) AS Names
PIVOT (MAX(NAME)
FOR Name_Type IN ([Legal], [Nickname], [Maiden]))
AS PVT;
DROP TABLE #MyTable;
Try this (replace "new_Table" with your table - name and "ID_" with your id - column):
SELECT ID_ AS rootID, (
SELECT Name
FROM new_table
WHERE Name_type = 'legal'
AND new_table.ID_ = rootID
) AS legal,
(
SELECT Name
FROM new_table
WHERE Name_type = 'Nickname'
AND ID_ = rootID
) AS Nickname,
(
SELECT Name
FROM new_table
WHERE Name_type = 'Maiden'
AND ID_ = rootID
) AS Maiden
FROM new_table
GROUP BY rootID;

ms-access 2010: count duplicate names per household address

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N
FAKEADDRESS2 EEE 5BB MRS BLOGGS SUZANNE P
FAKEADDRESS3 EEE 5RG MS SMITH PAULINE S
FAKEADDRESS4 EEE 4BV DR JONES ANNE D
FAKEADDRESS5 EEE 3AS MR TAYLOR STUART A
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning me all 130k odd rows, it only returns me around 67k rows. Is there anything I can do to the syntax to achieve the same result, but just returning every single row?
Any help is greatly appreciated!
Thanks
You could use something like this:
select *,
count(surname) over (partition by householdName)
from myTable
If you have only one column which contains the name,
ex: Rob Adams
then you can do this to have all the surnames in a different column so it will be easier in the select:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
in our example:
select right (surmane, charindex (' ',surname)-1) as surname
example on how to use charindex, left and right here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
if there are any questions, leave a comment.
EDIT: I edited the query, had a syntax error, please try it again. This works on sql server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
gives:
1 hiddenBase Adamns Kara 3
2 hiddenBase Adamns Anne 3
3 hiddenBase Adamns John 3
7 NandaParbat Merlin Malcom 1
4 QueensResidence Queen Oliver 2
5 QueensResidence Queen Moira 2
6 superSecretBase Diggle John 1
Your query should look like this:
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
EDIT
If over partition by isn't supported, then I guess you can get to your desired result by using group by
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
group by -- put here the fields in the select (one by one), however you can't write group by *
GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN
Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1 POSTCODE TITLE surname FORENAME MIDDLE_NAME PropertyAddress1 surname CountADD
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N FAKEADDRESS1 BLOGGS 2
FAKEADDRESS1 EEE 5BB MRS BLOGGS SUZANNE P FAKEADDRESS1 BLOGGS 2
FAKEADDRESS2 EEE 5RG MS SMITH PAULINE S FAKEADDRESS2 SMITH 1
FAKEADDRESS3 EEE 4BV DR JONES ANNE D FAKEADDRESS3 JONES 1
FAKEADDRESS4 EEE 3AS MR TAYLOR STUART A FAKEADDRESS4 TAYLOR 1

How to add table column headings to sql select statement

I have a SQL select statement like this:
select FirstName, LastName, Age from People
This will return me something like a table:
Peter Smith 34
John Walker 46
Pat Benetar 57
What I want is to insert the column headings into the first row like:
First Name Last Name Age
=========== ========== ====
Peter Smith 34
John Walker 46
Pat Benetar 57
Can someone suggest how this could be achieved?
Could you maybe create a temporary table with the headings and append the data one to this?
Neither of the answers above will work, unless all your names come after "first" in sort order.
Select FirstName, LastName
from (
select Sorter = 1, FirstName, LastName from People
union all
select 0, 'FirstName', 'LastName') X
order by Sorter, FirstName -- or whatever ordering you need
If you want to do this to all non-varchar columns as well, the CONS are (at least):
ALL your data will become VARCHAR. If you use Visual Studio for example, you will NO LONGER be able to recognize or use date values. Or int values. Or any other for that matter.
You need to explicitly provide a format to datetime values like DOB. DOB values in Varchar in the format dd-mm-yyyy (if that is what you choose to turn them into) won't sort properly.
The SQL to achieve this, however not-recommended, is
Select FirstName, LastName, Age, DOB
from (
select Sorter = 1,
Convert(Varchar(max), FirstName) as FirstName,
Convert(Varchar(max), LastName) as LastName,
Convert(Varchar(max), Age) as Age,
Convert(Varchar(max), DOB, 126) as DOB
from People
union all
select 0, 'FirstName', 'LastName', 'Age', 'DOB') X
order by Sorter, FirstName -- or whatever ordering you need
The lightest-weight way to do this is probably to do a UNION:
SELECT 'FirstName' AS FirstName, 'LastName' AS LastName
UNION ALL
SELECT FirstName, LastName
FROM People
No need to create temporary tables.
The UNION All is the solution except it should be pointed out that:
To add a header to a non-char column will require converting the column in the first part of the query.
If the converted column is used as part of the sort order then the field reference will have to be to the name of the column in the query, not the table
example:
Select Convert(varchar(25), b.integerfiled) AS [Converted Field]...
... Order by [Converted Field]