Select MAX Value from pivot - sql

I am working on an application where the user enters their name, age, and state of residence. We want to capture the state to use for various things. This is very straightforward when only one name, age, and state of residence is listed. In some cases, though, the user may list more than one. All of the data is stored in one table called Table, which has an ID, Name, and Value column. I am attempting to pivot the data and then select the state of residence I need. There are two scenarios I need to account for.
-There are multiple names, ages, states of residence entered on the application. The state of residence we want is the one with the highest age. i.e.
Name Age State of Residence
John 25 FL
Bill 31 AL
Sue 26 MS
In scenario 1 I want to return AL as the State of Residence
-There are multiple names, ages, states of residence entered on the application. If the ages are the same in multiple states of residence, return the state that falls first alphabetically.
Name Age State of Residence
John 25 FL
Will 25 CA
Sue 26 MS
In scenario 2 I want to return CA as the State of Residence
I have tried:
Select State, Age
FROM (
Select * from Table
where ID = @ID and Name IN (State,Age)
) as s
PIVOT
(
MAX(value) FOR Name IN (State,Age)
) as pvt
This returns the data, I just lack the knowledge on how to get the rest of the way. I tried adding group by but it would not work with the PIVOT.
I also tried it this way:
DECLARE @State AS NVARCHAR(2)
SELECT
[State], [Acres]
INTO #tmpTable FROM Table
PIVOT
(
MAX(Value)
FOR [Name]
IN ([State],[Acres])) AS p
WHERE ID = @ID
--
SELECT * FROM #tmpTable
--
DROP TABLE #tmpTable
Edited/Updated information:
I have one table called dbo.Table with ID,NAME,VALUE,INSTANCE columns. In the first scenario, the table looks like this after the user enters two different records:
ID NAME VALUE INSTANCE
1000 FIRST NAME JOHN 1
1000 AGE 25 1
1000 STATE AZ 1
1000 FIRST NAME BILL 2
1000 AGE 27 2
1000 STATE NH 2
I want to return NH as the State of Residence because it has the highest age.
In the second scenario the table looks like:
ID NAME VALUE INSTANCE
1000 FIRST NAME JOHN 1
1000 AGE 25 1
1000 STATE AZ 1
1000 FIRST NAME BILL 2
1000 AGE 25 2
1000 STATE NH 2
I want to return AZ as the State of Residence because it falls first alphabetically
I am accounting for both scenarios. Neither is guaranteed to occur, since the user may enter only one record. Either way, a State of Residence is always returned. I hope that makes more sense. I attempted to pivot the data as you see above, but I do not understand how to get the state I need. Thanks.

Edited
OK, I get the point. You're suffering from a badly designed table. It is not even in 1NF (first normal form), where each table column (attribute) stores only a single value from a single domain.
In your case you have a generic table that accepts any kind of attribute you want. The best option is to move your attributes into fixed columns in an ordinary 1NF table, like:
CREATE TABLE dbo.table (
instance INTEGER ... ,
first_name VARCHAR... ,
age integer... ,
state CHAR(2)...
)
Anyways, if you're working with legacy applications that you cannot modify, try this out:
-- Create a database VIEW to simulate a 1NF table
CREATE VIEW dbo.view1 AS
SELECT DISTINCT
t.INSTANCE,
(
SELECT
t2.value
FROM
dbo.table t2
WHERE
t2.id = t.id
AND t2.instance = t.instance
AND t2.name = 'FIRST NAME'
) AS FIRST_NAME,
(
SELECT
t2.value
FROM
dbo.table t2
WHERE
t2.id = t.id
AND t2.instance = t.instance
AND t2.name = 'AGE'
) AS AGE,
(
SELECT
t2.value
FROM
dbo.table t2
WHERE
t2.id = t.id
AND t2.instance = t.instance
AND t2.name = 'STATE'
) AS STATE
FROM
dbo.table t
Then...
Question 1 - although all fields are selected you can fetch only state if you want
SELECT
*
FROM
dbo.view1 v
WHERE
v.age = (
SELECT
MAX(v2.age)
FROM
dbo.view1 v2
)
Question 2 - First state according to alphabetical order ascending
SELECT
MIN(v.state) AS state
FROM
dbo.view1 v
WHERE
-- you can define a WHERE condition to meet your requirements here
v.age = 25
If you like it, please upvote this answer or mark it as acceptable :)
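Both scenarios can also be covered by one query: pivot each INSTANCE into a single row, sort by age descending and state ascending, and keep the first row. What follows is only a minimal sketch under stated assumptions, not the method from the answer above. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, it uses conditional aggregation in place of PIVOT, and the names app_data, state_of_residence, and sample are hypothetical. On SQL Server the same idea would use SELECT TOP 1 instead of LIMIT 1.

```python
import sqlite3

# Hypothetical table name: app_data stands in for the asker's "Table".
BEST_STATE = """
WITH people AS (
    -- One row per INSTANCE: conditional aggregation plays the role of PIVOT.
    SELECT instance,
           MAX(CASE WHEN name = 'AGE' THEN CAST(value AS INTEGER) END) AS age,
           MAX(CASE WHEN name = 'STATE' THEN value END) AS state
    FROM app_data
    WHERE id = ?
    GROUP BY instance
)
SELECT state FROM people
ORDER BY age DESC, state ASC   -- highest age first, ties broken alphabetically
LIMIT 1
"""


def state_of_residence(rows, app_id):
    """Load (id, name, value, instance) rows and return the chosen state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE app_data (id INTEGER, name TEXT, value TEXT, instance INTEGER)")
    conn.executemany("INSERT INTO app_data VALUES (?, ?, ?, ?)", rows)
    found = conn.execute(BEST_STATE, (app_id,)).fetchone()
    conn.close()
    return found[0] if found else None


def sample(age_1, age_2):
    """Two records as in the question; only the ages vary."""
    return [
        (1000, "FIRST NAME", "JOHN", 1), (1000, "AGE", str(age_1), 1),
        (1000, "STATE", "AZ", 1),
        (1000, "FIRST NAME", "BILL", 2), (1000, "AGE", str(age_2), 2),
        (1000, "STATE", "NH", 2),
    ]


print(state_of_residence(sample(25, 27), 1000))  # NH (highest age)
print(state_of_residence(sample(25, 25), 1000))  # AZ (tie, first alphabetically)
```

The CAST matters because VALUE is a text column: without it the ages would be compared as strings, so '9' would sort above '25'.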

Related

SQLite query to get table based on values of another table

I am not sure what title has to be here to correctly reflect my question, I can only describe what I want.
There is a table with fields:
id, name, city
There are next rows:
1 John London
2 Mary Paris
3 John Paris
4 Samy London
I want to get a result like this:
       London  Paris
Total  2       2
John   1       1
Mary   0       1
Samy   1       0
So, I need to take all unique values of name and find an appropriate quantity for unique values of another field (city)
Also I want to get a total quantity of each city
Simple way to do it is:
1)Get a list of unique names
SELECT DISTINCT name FROM table
2)Get a list of unique cities
SELECT DISTINCT city FROM table
3)Create a query for every name and city
SELECT COUNT(city) FROM table WHERE name = some_name AND city = some_city
4)Get total:
SELECT COUNT(city) FROM table WHERE name = some_name
(I didn't test these queries, so there may be some errors here, but they are only meant to show the idea)
As there are 3 names and 2 cities -> 3 * 2 = 6 queries to DB
But for a table with 100 cities and 100 names -> 100 * 100 = 10 000 queries to DB
and it may take a lot of time to do.
Also, names and cities may be changed, so, I can't create a query with predefined names or cities as every day it's new ones, so, instead of London and Paris it may be Moscow, Turin and Berlin. The same thing with names.
How can I get such a table with one or two queries to the original table using SQLite?
(sqlite: I do it for android)
You can get the per-name results with conditional aggregation. As for the total, unfortunately SQLite does not support the WITH ROLLUP clause, which would generate it automatically.
One workaround is union all and an additional column for ordering:
select name, london, paris
from (
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
) t
order by prio, name
Actually the subquery might not be necessary:
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
order by prio, name
@GMB gave me the idea of using GROUP BY. Since I am doing this for SQLite on Android, my answer looks like:
SELECT name,
COUNT(CASE WHEN city = :london THEN 1 END) as countLondon,
COUNT(CASE WHEN city = :paris THEN 1 END) as countParis
FROM table2 GROUP BY name
where :london and :paris are passed params, and countLondon and countParis are fields of the response class
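Because the question says the set of cities changes from day to day, the column list of such a query has to be generated at run time. The following is a rough sketch of that idea, not code from either answer: it reads the distinct cities first, then builds one SUM(city = ?) term per city and binds the city names as parameters, so the whole table comes from two queries. It is written in Python with the sqlite3 module for brevity (on Android the same SQL string could be assembled the same way), and the table name people and the function city_pivot are hypothetical.

```python
import sqlite3


def city_pivot(conn):
    """Return (cities, rows); each row is (name, one count per city)."""
    # Query 1: the current list of cities decides the columns.
    cities = [r[0] for r in conn.execute(
        "SELECT DISTINCT city FROM people ORDER BY city")]
    if not cities:
        return [], []
    # One conditional-aggregation term per city; the values are bound as
    # parameters, so city names never get pasted into the SQL text.
    counts = ", ".join("SUM(city = ?)" for _ in cities)
    sql = (
        f"SELECT name, {counts}, 1 AS prio FROM people GROUP BY name "
        f"UNION ALL "
        f"SELECT 'Total', {counts}, 0 FROM people "
        f"ORDER BY prio, name"
    )
    # Query 2: the placeholders appear twice, once per SELECT of the union.
    rows = conn.execute(sql, cities + cities).fetchall()
    return cities, [row[:-1] for row in rows]  # drop the helper prio column


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", [
    (1, "John", "London"), (2, "Mary", "Paris"),
    (3, "John", "Paris"), (4, "Samy", "London"),
])
cities, rows = city_pivot(conn)
print(cities)
for row in rows:
    print(row)
```

Keep in mind that SQLite limits the number of bound parameters per statement, so a very large number of cities may need a different approach, such as a plain GROUP BY name, city that is pivoted in application code.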

Set Duplicate Values to Null in PostgresSQL retaining one of the values

I have a database like this:
id name email
0 Bill bill@fakeemail.com
1 John john@fakeemail.com
2 Susan susan@fakeemail.com
3 Susan J susan@fakeemail.com
I want to remove duplicate emails by setting the value to null, but retain at least 1 email on one of the rows (doesn't really matter which one).
So that the resulting database would look like this:
id name email
0 Bill bill@fakeemail.com
1 John john@fakeemail.com
2 Susan susan@fakeemail.com
3 Susan J
I was able to target the rows like this
SELECT COUNT(email) as count FROM users WHERE count > 1
But can't figure out how to set the value to null while still retaining at least 1.
Update the rows which have the same email but greater id:
update my_table t1
set email = null
where exists (
select from my_table t2
where t1.email = t2.email and t1.id > t2.id
);
Working example in rextester.
You can use a windowed partition to assign a row number to each email group, and then use that generated row number to modify all rows except for one. Something like this:
WITH annotated_persons AS (
SELECT
id,
name,
email,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS i
FROM
persons
)
UPDATE persons
SET email = NULL
FROM annotated_persons
WHERE persons.id = annotated_persons.id AND annotated_persons.i <> 1
Alternatively, you can use another subquery to gather the IDs of persons whose row number is not 1, and then change the filter of your update query to
WHERE id IN (SELECT id FROM annotated_persons WHERE i <> 1)
It's been a while since I've used a window function.
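The subquery variant mentioned above can be sketched as follows. This is an illustration under assumptions rather than a tested PostgreSQL script: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), and the persons table is filled with the sample rows from the question. The UPDATE statement itself sticks to syntax that PostgreSQL also accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany("INSERT INTO persons VALUES (?, ?, ?)", [
    (0, "Bill", "bill@fakeemail.com"),
    (1, "John", "john@fakeemail.com"),
    (2, "Susan", "susan@fakeemail.com"),
    (3, "Susan J", "susan@fakeemail.com"),
])

# Number the rows inside each email group, lowest id first, then blank out
# the email of every row that is not the first one of its group.
conn.execute("""
UPDATE persons
SET email = NULL
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS i
        FROM persons
    ) AS ranked
    WHERE i <> 1
)
""")

result = conn.execute("SELECT id, name, email FROM persons ORDER BY id").fetchall()
for row in result:
    print(row)
```

The ORDER BY id inside the window makes the choice of the surviving row deterministic: the row with the lowest id in each group keeps its email.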

SQL: Tree structure without parent key

Note: The Data schema can not be changed. I'm stuck with it.
Database: SQLite
I have a simple tree structure, without parent keys, that is only 1 level deep. I have simplified the data for clarity:
ID Content Title
1 Null Canada
2 25 Toronto
3 33 Vancouver
4 Null USA
5 45 New York
6 56 Dallas
The structure is ordinal as well, so the IDs of all Canadian cities are greater than Canada's ID of 1 and less than the USA's ID of 4.
Question: How do I select all of a nation's cities when I do not know how many there are?
My query pairs every country with its cities in one result set, which is probably more than you want, but:
http://sqlfiddle.com/#!5/94d63/3
SELECT *
FROM (
SELECT
place.Title AS country_name,
place.ID AS id,
(SELECT MIN(ID)
FROM place AS next_place
WHERE next_place.ID > place.ID
AND next_place.Content IS NULL
) AS next_id
FROM place
WHERE place.Content IS NULL
) AS country
INNER JOIN place
ON place.ID > country.id
AND CASE WHEN country.next_id IS NOT NULL
THEN place.ID < country.next_id
ELSE 1 END
select * from tbl
where id > 1
and id < (select min(id) from tbl where content is null and id > 1)
EDIT
I just realized the above does not work if there is no country with a greater ID. This should fix it.
select a.* from tbl a
join tbl c on c.id = 4
where a.id > c.id
and a.id < coalesce((select min(b.id) from tbl b where b.content is null and b.id > c.id), a.id + 1)
Edit 2 - The country id now appears in only one place (the join condition), so that is the only thing you have to change.
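The range idea above (rows after the country's ID and before the next row whose Content is NULL) can be checked with a small sketch like this one. It is an illustration under assumptions: the table name place is borrowed from the first answer, the function cities_of is hypothetical, and the country ID is passed as a named parameter so that it is written only once on the Python side.

```python
import sqlite3

CITIES_OF_COUNTRY = """
SELECT a.Title
FROM place AS a
WHERE a.ID > :country_id
  AND a.ID < COALESCE(
        -- ID of the next country row, if there is one
        (SELECT MIN(b.ID) FROM place AS b
         WHERE b.Content IS NULL AND b.ID > :country_id),
        -- last country in the table: no upper bound, so always true
        a.ID + 1)
ORDER BY a.ID
"""


def cities_of(conn, country_id):
    """Return the titles of the rows between this country and the next."""
    cur = conn.execute(CITIES_OF_COUNTRY, {"country_id": country_id})
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (ID INTEGER PRIMARY KEY, Content INTEGER, Title TEXT)")
conn.executemany("INSERT INTO place VALUES (?, ?, ?)", [
    (1, None, "Canada"), (2, 25, "Toronto"), (3, 33, "Vancouver"),
    (4, None, "USA"), (5, 45, "New York"), (6, 56, "Dallas"),
])

print(cities_of(conn, 1))  # ['Toronto', 'Vancouver']
print(cities_of(conn, 4))  # ['New York', 'Dallas']
```

The COALESCE wraps the whole scalar subquery because MIN over zero rows yields NULL, and a comparison with NULL would otherwise filter out every city of the last country.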
There are a few things to consider here, mainly whether or not your data is going to change.
If your data is always organized exactly as shown in your example, you can simply skip the first three rows, i.e.
SELECT * FROM CITIES WHERE ID NOT IN (SELECT ID FROM CITIES ORDER BY ID LIMIT 3)
Otherwise, you can create another table in which you specify which city belongs to which parent, and build the hierarchy yourself.
I recommend the second approach.

Sql COALESCE entire rows?

I just learned about COALESCE and I'm wondering if it's possible to COALESCE an entire row of data between two tables? If not, what's the best approach to the following ramblings?
For instance, I have these two tables and assuming that all columns match:
tbl_Employees
Id Name Email Etc
-----------------------------------
1 Sue ... ...
2 Rick ... ...
tbl_Customers
Id Name Email Etc
-----------------------------------
1 Bob ... ...
2 Dan ... ...
3 Mary ... ...
And a table with id's:
tbl_PeopleInCompany
Id CompanyId
-----------------
1 1
2 1
3 1
And I want to query the data in a way that gets rows from the first table with matching id's, but gets from second table if no id is found.
So the resulting query would look like:
Id Name Email Etc
-----------------------------------
1 Sue ... ...
2 Rick ... ...
3 Mary ... ...
Where Sue and Rick was taken from the first table, and Mary from the second.
SELECT Id, Name, Email, Etc FROM tbl_Employees
WHERE Id IN (SELECT Id FROM tbl_PeopleInCompany)
UNION ALL
SELECT Id, Name, Email, Etc FROM tbl_Customers
WHERE Id IN (SELECT Id FROM tbl_PeopleInCompany) AND
Id NOT IN (SELECT Id FROM tbl_Employees)
Depending on the number of rows, there are several different ways to write these queries (with JOIN and EXISTS), but try this first.
This query first selects all the people from tbl_Employees that have an Id value in your target list (the table tbl_PeopleInCompany). It then adds the results of the second query to the "bottom" of this bunch of rows. The second query gets all tbl_Customers rows with Ids in your target list, excluding any with Ids that appear in tbl_Employees.
The total list contains the people you want: all Ids from tbl_PeopleInCompany, with preference given to Employees and the missing records pulled from Customers.
You can also do this:
1) Outer Join the two tables on tbl_Employees.Id = tbl_Customers.Id. This will give you all the rows from tbl_Employees and leave the tbl_Customers columns null if there is no matching row.
2) Use CASE WHEN to select either the tbl_Employees column or tbl_Customers column, based on whether tbl_Customers.Id IS NULL, like this:
CASE WHEN tbl_Customers.Id IS NULL THEN tbl_Employees.Name ELSE tbl_Customers.Name END AS Name
(My syntax might not be perfect there, but the technique is sound).
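A variant of the join-plus-CASE technique that starts from the id table can be sketched like this. It is a sketch under assumptions, not the exact query described above: it runs on SQLite through Python's sqlite3 module, it left-joins both tables from tbl_PeopleInCompany so that ids missing from tbl_Employees still appear, and the email values are made up because the question elides them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_Employees (Id INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
CREATE TABLE tbl_Customers (Id INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
CREATE TABLE tbl_PeopleInCompany (Id INTEGER, CompanyId INTEGER);
-- Hypothetical email values; the question only shows "...".
INSERT INTO tbl_Employees VALUES (1, 'Sue', 'sue@example.com'), (2, 'Rick', NULL);
INSERT INTO tbl_Customers VALUES (1, 'Bob', 'bob@example.com'),
                                 (2, 'Dan', 'dan@example.com'),
                                 (3, 'Mary', 'mary@example.com');
INSERT INTO tbl_PeopleInCompany VALUES (1, 1), (2, 1), (3, 1);
""")

# The CASE tests the employee Id, so every column of a row comes from the
# same table: the employee if one exists, otherwise the customer.
people = conn.execute("""
SELECT p.Id,
       CASE WHEN e.Id IS NOT NULL THEN e.Name  ELSE c.Name  END AS Name,
       CASE WHEN e.Id IS NOT NULL THEN e.Email ELSE c.Email END AS Email
FROM tbl_PeopleInCompany AS p
LEFT JOIN tbl_Employees AS e ON e.Id = p.Id
LEFT JOIN tbl_Customers AS c ON c.Id = p.Id
WHERE p.CompanyId = ?
ORDER BY p.Id
""", (1,)).fetchall()

for person in people:
    print(person)
```

Writing COALESCE(e.Email, c.Email) per column would behave differently: Rick's missing email would be filled in from the customer Dan, mixing two rows. Testing e.Id once per column avoids that.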
This should be pretty performant. It uses a CTE to basically build a small table of Customers that have no matching Employee records, and then it simply UNIONs that result with the Employee records
;WITH FilteredCustomers (Id, Name, Email, Etc)
AS
(
SELECT C.Id, C.Name, C.Email, C.Etc
FROM tbl_Customers C
INNER JOIN tbl_PeopleInCompany PIC
ON C.Id = PIC.Id
LEFT JOIN tbl_Employees E
ON C.Id = E.Id
WHERE E.Id IS NULL
)
SELECT E.Id, E.Name, E.Email, E.Etc
FROM tbl_Employees E
INNER JOIN tbl_PeopleInCompany PIC
ON E.Id = PIC.Id
UNION
SELECT Id, Name, Email, Etc
FROM FilteredCustomers
Using the IN Operator can be rather taxing on large queries as it might have to evaluate the subquery for each record being processed.
I don't think the COALESCE function can be used for what you're thinking. COALESCE is similar to ISNULL, except it allows you to pass in multiple columns, and will return the first non-null value:
SELECT Name, Class, Color, ProductNumber,
COALESCE(Class, Color, ProductNumber) AS FirstNotNull
FROM Production.Product
This article should explain its application:
http://msdn.microsoft.com/en-us/library/ms190349.aspx
It sounds like Larry Lustig's answer is more along the lines of what you need though.

Using sql to keep only a single record where both name field and address field repeat in 5+ records

I am trying to delete all but one record from a table wherever the same name and address combination repeats in five or more records. So if there are 5 records whose name field and address field are all the same, I would like to delete 4 of the 5. An example:
id name address
1 john 6440
2 john 6440
3 john 6440
4 john 6440
5 john 6440
I would only want to return 1 record from the 5 records above.
I'm still having problems with this.
1) I create a table called KeepThese and give it a primary key id.
2) I create a query called delete_1 and copy this into it:
INSERT INTO KeepThese
SELECT ID FROM
(
SELECT Min(ID) AS ID
FROM Print_Ready
GROUP BY names_1, addresses
HAVING COUNT(*) >=5
UNION ALL
SELECT ID FROM Print_Ready as P
INNER JOIN
(SELECT Names_1, addresses
FROM Print_ready
GROUP BY Names_1, addresses
HAVING COUNT(*) < 5) as ThoseLessThan5
ON ThoseLessThan5.Names_1 = P.Names_1
AND ThoseLessThan5.addresses = P.addresses
)
3) I create a query called delete_2 and copy this into it:
DELETE P.* FROM Print_Ready as P
LEFT JOIN KeepThese as K
ON K.ID = P.ID
WHERE K.ID IS NULL
4) Then I run delete_1. I get a message that says "circular reference caused by alias ID" So I change this piece:
FROM (SELECT Min(ID) AS ID
to say this:
FROM (SELECT Min(ID) AS ID2
Then I double click again and a popup displays saying Enter Parameter Value for ID. This indicates that it doesn't know what ID is. But print_ready is only a query, and while it has an id, that is in reality the id of another table that got filtered into this query.
Not sure what to do at this point.
I'm not sure that CREATE TABLE isolate_duplicates AS works in Access; besides, you would have to give COUNT(*) a name for the new table.
This may work:
SELECT name, address
INTO isolate_duplicate
FROM print_ready
GROUP BY name, address
HAVING COUNT(*) > 4
DELETE FROM print_ready
WHERE name + address
IN (SELECT name + address
FROM isolate_duplicate)
INSERT INTO print_ready (name, address)
SELECT name, address
FROM isolate_duplicate
DROP TABLE isolate_duplicate
Not tested.
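The same goal (keep one row of every name and address group that has five or more members, and leave smaller groups alone) can also be reached with a single DELETE that keeps the lowest id of each large group. The following is a sketch under stated assumptions rather than Access SQL: it runs on SQLite through Python's sqlite3 module, print_ready is created here as a real table with its own id column (in the question it is a query), and the extra sample rows are made up.

```python
import sqlite3

DEDUPE = """
DELETE FROM print_ready
WHERE id IN (
    SELECT p.id
    FROM print_ready AS p
    JOIN (
        -- groups with 5 or more identical rows, and the id to keep for each
        SELECT name, address, MIN(id) AS keep_id
        FROM print_ready
        GROUP BY name, address
        HAVING COUNT(*) >= 5
    ) AS g ON g.name = p.name AND g.address = p.address
    WHERE p.id <> g.keep_id
)
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE print_ready (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
rows = [(i, "john", "6440") for i in range(1, 6)]            # five duplicates
rows += [(6, "mary", "12"), (7, "mary", "12"), (8, "bob", "7")]  # made-up extras
conn.executemany("INSERT INTO print_ready VALUES (?, ?, ?)", rows)

conn.execute(DEDUPE)
remaining = conn.execute(
    "SELECT id, name, address FROM print_ready ORDER BY id").fetchall()
for row in remaining:
    print(row)
```

Unlike the delete-and-reinsert approach, the surviving row keeps its original id, and no temporary table is needed.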