Getting sub data from a list of facilities - sql

I am trying to write a query and would like some help if possible. Thanks in advance.
I have a table of facility data (~100k rows) that I am getting from a public source. The data contains several records for what I would consider to be the same place (same name, city, state); they just have different suite numbers. The other interesting detail is that I keep a selection counter on the data that I increment any time someone chooses one of the facilities. This way, I can use the selection count along with some other weight calculations to rank results higher in a list.
What I am trying to do is write a query that, when someone enters a search term, shows only one record per facility (the one with the highest selection count) and omits the rest.
Note: I do not want to do any preprocessing to the data as it is going to get re-loaded monthly.
Schema:
ID
Name
Address 1
Address 2
City
State
Zip
Phone
Selection Count
Example Search: "women"
ID  Name                         City       State  Selection Count
1   Brigham & Women's Hospital   Boston     MA     22
2   Brigham & Women's Hospital   Cambridge  MA     0
3   Brigham & Women's Hospital   Boston     MA     5
4   Brigham & Women's Hospital   Boston     MA     1
5   Brigham & Women's Hospital   Orlando    FL     3
6   Woman's Hospital of Detroit  Detroit    MI     100
7   Brigham & Women's Hospital   Boston     MA     0
8   Woman's Hospital of Detroit  Detroit    MI     55
What I'd like is a resultset that contains 1, 2, 5, 6
1, 3, 4 and 7 are the same facility, so bring back only the one with the top selection count. Same for 6 and 8.
I am sure that there is a having and a top clause in here somewhere, but I have not been able to get this to do what I want.
Thoughts?

How about
select id, name, city, state, selcount
from t
where exists
(
    select 1
    from (select name, city, state, max(selcount) selcount
          from t
          group by name, city, state) s
    where s.name = t.name and s.city = t.city
      and s.state = t.state and s.selcount = t.selcount
)
I've built a SQL Fiddle for this to show a working example.

WITH cteRowNum AS (
SELECT ID, Name, City, State, [Selection Count],
ROW_NUMBER() OVER(PARTITION BY Name, City, State ORDER BY [Selection Count] DESC) AS RowNum
FROM YourTable
)
SELECT ID, Name, City, State, [Selection Count]
FROM cteRowNum
WHERE RowNum = 1;
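For anyone who wants to try the ROW_NUMBER() approach without a SQL Server instance, here is a minimal sketch that runs it against the sample rows from the question using Python's built-in sqlite3 module. The table name `facilities` and the column name `selection_count` are placeholders of mine, the search predicate is left out, and it assumes an SQLite build with window functions (3.25 or newer). I also added `id` as a tie-breaker so the result is deterministic when two rows share the same count.

```python
import sqlite3

# Sample rows from the question: (id, name, city, state, selection_count).
rows = [
    (1, "Brigham & Women's Hospital", "Boston", "MA", 22),
    (2, "Brigham & Women's Hospital", "Cambridge", "MA", 0),
    (3, "Brigham & Women's Hospital", "Boston", "MA", 5),
    (4, "Brigham & Women's Hospital", "Boston", "MA", 1),
    (5, "Brigham & Women's Hospital", "Orlando", "FL", 3),
    (6, "Woman's Hospital of Detroit", "Detroit", "MI", 100),
    (7, "Brigham & Women's Hospital", "Boston", "MA", 0),
    (8, "Woman's Hospital of Detroit", "Detroit", "MI", 55),
]

conn = sqlite3.connect(":memory:")
# "facilities" and "selection_count" are placeholder names for this sketch.
conn.execute(
    "CREATE TABLE facilities ("
    "id INTEGER PRIMARY KEY, name TEXT, city TEXT, state TEXT, "
    "selection_count INTEGER)"
)
conn.executemany("INSERT INTO facilities VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH ranked AS (
    SELECT id, name, city, state, selection_count,
           ROW_NUMBER() OVER (
               PARTITION BY name, city, state
               ORDER BY selection_count DESC, id  -- id breaks ties
           ) AS row_num
    FROM facilities
)
SELECT id, name, city, state, selection_count
FROM ranked
WHERE row_num = 1
ORDER BY id
"""

top_ids = [row[0] for row in conn.execute(query)]
print(top_ids)  # [1, 2, 5, 6]
```

Unlike a join against MAX(selection count), this keeps exactly one row per (name, city, state) group even when the top count is tied.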


How do I select all values which satisfy certain aggregate properties?

Say, I have a table (named "Customers") which consists of:
CustomerName
Country
City
I am trying to list the names of all customers from cities where there are at least two customers.
This is my initial attempt:
SELECT CustomerName, City
FROM Customers
GROUP BY City
HAVING COUNT(City) > 1
This is the result that I got:
CustomerName  City
Person A      New York
Person C      Los Angeles
Here, Person A is simply the first customer from New York in the table, and similarly Person C for Los Angeles. However, what I wanted was a list of all customers from New York and Los Angeles.
When I tried:
SELECT COUNT(CustomerName), City
FROM Customers
GROUP BY City
HAVING COUNT(City) > 1
I had
COUNT(CustomerName)  City
3                    New York
5                    Los Angeles
This means that the code is working properly, except that my original code only displays a person on top of the table from NY and LA. How do I resolve this issue?
Get cities with more than 1 customer in a subquery, and use that list to select the customers:
SELECT cst.CustomerName
FROM Customers cst
WHERE Cst.City in (
-- All cities where there are at least two customers
SELECT CstGE2.City
FROM Customers CstGE2
GROUP BY CstGE2.City
HAVING count(*) >= 2
)
How about this? I've taken the city out of the select part since you said you just wanted customer names.
SELECT a.CustomerName
FROM Customers a
WHERE (
SELECT COUNT(b.CustomerName)
FROM Customers b
WHERE b.City = a.City
) > 1
ORDER BY a.CustomerName
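To see the subquery approach return every customer rather than one per group, here is a small sketch using Python's sqlite3 module. The five sample customers are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT, City TEXT)")
# Hypothetical sample data: two shared cities and one city with a single customer.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("Person A", "USA", "New York"),
        ("Person B", "USA", "New York"),
        ("Person C", "USA", "Los Angeles"),
        ("Person D", "USA", "Los Angeles"),
        ("Person E", "USA", "Chicago"),
    ],
)

query = """
SELECT CustomerName, City
FROM Customers
WHERE City IN (
    -- Cities that have at least two customers
    SELECT City
    FROM Customers
    GROUP BY City
    HAVING COUNT(*) >= 2
)
ORDER BY City, CustomerName
"""

customers = conn.execute(query).fetchall()
for name, city in customers:
    print(name, city)
```

The grouping happens only inside the subquery, so the outer query still sees one row per customer; Person E from Chicago is the only one filtered out.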

SQLite query to get table based on values of another table

I am not sure what title has to be here to correctly reflect my question, I can only describe what I want.
There is a table with fields:
id, name, city
It contains the following rows:
1 John London
2 Mary Paris
3 John Paris
4 Samy London
I want to get a result like this:
       London  Paris
Total  2       2
John   1       1
Mary   0       1
Samy   1       0
So, I need to take all unique values of name and find an appropriate quantity for unique values of another field (city)
Also I want to get a total quantity of each city
Simple way to do it is:
1)Get a list of unique names
SELECT DISTINCT name FROM table
2)Get a list of unique cities
SELECT DISTINCT city FROM table
3)Create a query for every name and city
SELECT COUNT(city) FROM table WHERE name = some_name AND city = some_city
4)Get total:
SELECT COUNT(city) FROM table WHERE name = some_name
(I didn't test these queries, so there may be some errors, but they are only meant to show the idea.)
As there are 3 names and 2 cities -> 3 * 2 = 6 queries to DB
But for a table with 100 cities and 100 names -> 100 * 100 = 10 000 queries to DB
and it may take a lot of time to do.
Also, names and cities may be changed, so, I can't create a query with predefined names or cities as every day it's new ones, so, instead of London and Paris it may be Moscow, Turin and Berlin. The same thing with names.
How to get such table with one-two queries to original table using sqlite?
(sqlite: I do it for android)
You can get the per-name results with conditional aggregation. As for the total, unfortunately SQLite does not support the with rollup clause, which would generate it automatically.
One workaround is union all and an additional column for ordering:
select name, london, paris
from (
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
) t
order by prio, name
Actually the subquery might not be necessary:
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
order by prio, name
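Here is a quick way to check the union all version against the four rows from the question, sketched with Python's sqlite3 module. The table name `people` is a placeholder (the question's `table` is a reserved word), and the city names are still hard-coded as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a placeholder table name for this sketch.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", "London"), (2, "Mary", "Paris"),
     (3, "John", "Paris"), (4, "Samy", "London")],
)

# In SQLite a comparison evaluates to 0 or 1, so SUM(city = 'London')
# counts the matching rows. prio = 0 sorts the total row first.
query = """
SELECT name, SUM(city = 'London') AS london, SUM(city = 'Paris') AS paris, 1 AS prio
FROM people
GROUP BY name
UNION ALL
SELECT 'Total', SUM(city = 'London'), SUM(city = 'Paris'), 0
FROM people
ORDER BY prio, name
"""

pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```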
@GMB gave me the idea of using GROUP BY. Since I am doing this with SQLite on Android, my answer looks like:
SELECT name,
COUNT(CASE WHEN city = :london THEN 1 END) as countLondon,
COUNT(CASE WHEN city = :paris THEN 1 END) as countParis
FROM table2 GROUP BY name
where :london and :paris are passed params, and countLondon and countParis are fields of the response class
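Since the question says the cities change from day to day, one possible way to keep this at two queries is to read the distinct cities first and then generate one COUNT(CASE ...) column per city, binding the city names as parameters. The sketch below uses Python's sqlite3 module rather than the Android API; the table name `people` and the generated aliases `c0`, `c1`, ... are my own placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a placeholder table name for this sketch.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", "London"), (2, "Mary", "Paris"),
     (3, "John", "Paris"), (4, "Samy", "London")],
)

# Query 1: discover the current set of cities.
cities = [row[0] for row in conn.execute("SELECT DISTINCT city FROM people ORDER BY city")]

# Query 2: one counting column per city. Only "?" placeholders and generated
# aliases are interpolated into the SQL text; the city values are bound.
count_columns = ", ".join(
    f"COUNT(CASE WHEN city = ? THEN 1 END) AS c{i}" for i in range(len(cities))
)
query = f"SELECT name, {count_columns} FROM people GROUP BY name ORDER BY name"

counts_by_name = {
    name: dict(zip(cities, counts))
    for name, *counts in conn.execute(query, cities)
}
# Column totals can be summed in application code instead of a third query.
totals = {city: sum(row[city] for row in counts_by_name.values()) for city in cities}

print(counts_by_name)
print(totals)
```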

How to find people in a database who live in the same cities?

I'm new to SQL, and I'm asking for help in an apparently easy question, but it gets cumbersome in my mind.
I have the following table:
ID  NAME  CITY
---------------------
1   John  new york
2   Sam   new york
3   Tom   boston
4   Bob   boston
5   Jan   chicago
6   Ted   san francisco
7   Kat   boston
I want a query that returns all the people who live in a city that another person registered in the database also lives in.
The answer, for the table I showed above, would be:
ID  NAME  CITY
---------------------
1   John  new york
2   Sam   new york
3   Tom   boston
4   Bob   boston
7   Kat   boston
This is really a two part question:
What cities have more than one user located in them?
What users live in that subset of cities?
Let's answer it in two parts. Let's also make the simplifying assumption (not stated in your question) that the Users table has only one entry per user per city.
To find cities with more than one user:
SELECT City FROM Users GROUP BY City HAVING COUNT(*) > 1
Now, let's find all the users for those cities:
SELECT ID, Name, City FROM Users
WHERE City IN (SELECT City FROM Users GROUP BY City HAVING COUNT(*) > 1)
I would use EXISTS:
SELECT t.*
FROM persons t
WHERE EXISTS (SELECT 1 FROM persons t1 WHERE t1.city = t.city AND t1.id <> t.id);
To avoid a correlated subquery which leads to a nested loop, you could perform a self join:
SELECT id, name, city
FROM persons
JOIN (SELECT city
FROM persons
GROUP BY city HAVING count(*) > 1) AS cities
USING (city);
This might be the most performant solution.
This will give you the rows that have the same city more than 1 time:
SELECT persons.*
FROM persons
WHERE (SELECT COUNT(*) FROM persons AS p GROUP BY CITY HAVING p.CITY = persons.CITY) > 1
This is just a different flavor from the others that have posted.
SELECT ID,
       name,
       city
FROM (SELECT DISTINCT
             ID,
             name,
             city,
             COUNT(1) OVER (PARTITION BY city) AS cityCount
      FROM persons) t
WHERE cityCount > 1
This can be expressed many ways. Here is one possible way:
select * from persons p
where exists (
    select 1 from persons p2
    where p2.city = p.city and p2.id <> p.id
)
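The answers above take four different routes (IN with HAVING, a join on the grouped cities, EXISTS, and a window count), so here is a small sketch that runs one version of each against the sample table and checks that they agree. It uses Python's sqlite3 module, assumes the table is called `persons`, compares rows by `id` in the EXISTS variant so that two people with the same name are still treated as different rows, and needs SQLite 3.25 or newer for the window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?)",
    [
        (1, "John", "new york"), (2, "Sam", "new york"),
        (3, "Tom", "boston"), (4, "Bob", "boston"),
        (5, "Jan", "chicago"), (6, "Ted", "san francisco"),
        (7, "Kat", "boston"),
    ],
)

queries = {
    "in_having": """
        SELECT id FROM persons
        WHERE city IN (SELECT city FROM persons GROUP BY city HAVING COUNT(*) > 1)
        ORDER BY id""",
    "join": """
        SELECT id FROM persons
        JOIN (SELECT city FROM persons GROUP BY city HAVING COUNT(*) > 1) AS cities
        USING (city)
        ORDER BY id""",
    "exists": """
        SELECT p.id FROM persons p
        WHERE EXISTS (SELECT 1 FROM persons p2
                      WHERE p2.city = p.city AND p2.id <> p.id)
        ORDER BY p.id""",
    "window": """
        SELECT id FROM (
            SELECT id, COUNT(*) OVER (PARTITION BY city) AS city_count
            FROM persons
        )
        WHERE city_count > 1
        ORDER BY id""",
}

# Run every formulation and collect the ids it returns.
results = {label: [row[0] for row in conn.execute(sql)] for label, sql in queries.items()}
for label, ids in results.items():
    print(label, ids)
```

Which one is fastest depends on the database, its optimizer and the available indexes, so it is worth checking the query plan on real data rather than assuming.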

SQL value at min query without join

Quite often I have to solve the following problem. Suppose I have 3 columns: Col A, Col B, Col C.
Col A contains some data that I need to group on. Col B contains the value for which I need to find the minimum within each group of Col A. And Col C contains the data I would like to retrieve at the point where the minimum occurred, that is, the value from Col C corresponding to the minimum in Col B.
I commonly solve this problem by writing a JOIN statement. Does anybody know a better solution, without a JOIN? The problem seems so common to me that I would imagine it would make sense to have a dedicated SQL command. It is like finding the value of a function at its point of minimum (from a math point of view).
Regards,
UPDATE.
I apologize for not posting an example of what I mean. Here is my table:
Location    Quantity  Street
New York    2         Broad
New York    3         Main
Pittsburgh  1         Grove
Pittsburgh  5         School
Austin      7         Hayes
Austin      2         Barn
I would like to group by "Location" and choose min in "Quantity" for each group. Finally I would like to find "Street" corresponding to each min value.
Here is what my final output should look like:
Location    Quantity  Street
Austin      2         Barn
New York    2         Broad
Pittsburgh  1         Grove
And here is how I accomplish it (I believe it is too long and there should be a dedicated SQL function for it):
SELECT TheFunction.Location, Quantity, Street
FROM [AnalysisDatabase].[dbo].[ValueAtMin] AS TheFunction
INNER JOIN
    (SELECT [Location], MIN([Quantity]) AS MinValue
     FROM [AnalysisDatabase].[dbo].[ValueAtMin]
     GROUP BY [Location]) AS XValue
    ON TheFunction.Location = XValue.Location AND XValue.MinValue = TheFunction.Quantity
This works with your dataset in mysql:
select location, quantity, street
from tempso c
where quantity = (select min(quantity)
from tempso b
where c.location = b.location)
order by location
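As a quick check of the correlated-subquery form on the data from the question, here is a sketch using Python's sqlite3 module instead of MySQL. The table name `value_at_min` is a placeholder of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "value_at_min" is a placeholder table name for this sketch.
conn.execute("CREATE TABLE value_at_min (location TEXT, quantity INTEGER, street TEXT)")
conn.executemany(
    "INSERT INTO value_at_min VALUES (?, ?, ?)",
    [
        ("New York", 2, "Broad"), ("New York", 3, "Main"),
        ("Pittsburgh", 1, "Grove"), ("Pittsburgh", 5, "School"),
        ("Austin", 7, "Hayes"), ("Austin", 2, "Barn"),
    ],
)

# For each row, keep it only if its quantity equals the minimum
# quantity found for the same location.
query = """
SELECT location, quantity, street
FROM value_at_min AS c
WHERE quantity = (SELECT MIN(quantity)
                  FROM value_at_min AS b
                  WHERE b.location = c.location)
ORDER BY location
"""

min_rows = conn.execute(query).fetchall()
for row in min_rows:
    print(row)
```

Note that if two streets in one location share the minimum quantity, this form returns both of them.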
Try this sql below:
Data table (using #temp)
location    quantity  street
----------  --------  ------
New York    2         Broad
New York    3         Main
Pittsburgh  1         Grove
Pittsburgh  5         School
Austin      7         Hayes
Austin      2         Barn
SQL:
select location, quantity, street from #temp a
where quantity in (select min(quantity) as quantity from #temp b group by location having a.location = b.location)
order by location asc, quantity desc
Data Result:
location    quantity  street
----------  --------  ------
Austin      2         Barn
New York    2         Broad
Pittsburgh  1         Grove
In postgresql:
select
distinct on (a)
a,b,c
from tbl
order by a, b
You can also achieve this with a window function that picks the correct c for each group, and then group by a (by putting the window query inside the grouping query).
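The window-function variant is only described in words above, so here is one possible reading of it, sketched with Python's sqlite3 module on the data from the question. It numbers the rows within each location by ascending quantity and keeps the first one, which is a simpler form than wrapping the window query in a GROUP BY. The table name `value_at_min` is a placeholder, and SQLite 3.25 or newer is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "value_at_min" is a placeholder table name for this sketch.
conn.execute("CREATE TABLE value_at_min (location TEXT, quantity INTEGER, street TEXT)")
conn.executemany(
    "INSERT INTO value_at_min VALUES (?, ?, ?)",
    [
        ("New York", 2, "Broad"), ("New York", 3, "Main"),
        ("Pittsburgh", 1, "Grove"), ("Pittsburgh", 5, "School"),
        ("Austin", 7, "Hayes"), ("Austin", 2, "Barn"),
    ],
)

# Rank rows inside each location by quantity; rank 1 is the minimum.
query = """
SELECT location, quantity, street
FROM (
    SELECT location, quantity, street,
           ROW_NUMBER() OVER (PARTITION BY location ORDER BY quantity) AS rn
    FROM value_at_min
)
WHERE rn = 1
ORDER BY location
"""

value_at_min_rows = conn.execute(query).fetchall()
for row in value_at_min_rows:
    print(row)
```

This returns exactly one row per location and needs no join, at the cost of picking an arbitrary row when the minimum is tied.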

Fill Users table with data using percentages from another table

I have a Table Users (it has millions of rows)
Id    Name   Country    Product
----  -----  ---------  -------
1     John   Canada
2     Kate   Argentina
3     Mark   China
4     Max    Canada
5     Sam    Argentina
6     Stacy  China
...
1000  Ken    Canada
I want to fill the Product column with A, B or C based on percentages.
I have another table called CountriesStats like the following
Id  Country    A   B   C
--  ---------  --  --  --
1   Canada     60  20  20
2   Argentina  35  45  20
3   China      40  10  50
This table holds the percentage of people with each product. For example in Canada 60% of people have product A, 20% have product B and 20% have product C.
I would like to fill the Users table with data based on the percentages in the second table. So for example, if there are 1 million users in Canada, I would like to fill 600,000 of the Product column values in the Users table with A, 200,000 with B and 200,000 with C.
Thanks for any help on how to do that. I do not mind doing it in multiple steps; I just need hints on how I can achieve that in SQL.
The logic behind this is not too difficult. Assign a sequential counter to each person in each country. Then, using this value, assign the correct product based on this value. For instance, in your example, when the number is less than or equal to 600,000 then 'A' gets assigned. For 600,001 to 800,000 then 'B', and finally 'C' to the rest.
The following SQL accomplishes this:
with toupdate as (
select u.*,
row_number() over (partition by country order by newid()) as seqnum,
count(*) over (partition by country) as tot
from users u
)
update u
set product = (case when seqnum <= tot * A / 100 then 'A'
when seqnum <= tot * (A + B) / 100 then 'B'
else 'C'
end)
from toupdate u join
CountriesStats cs
on u.country = cs.country;
The with clause defines an updatable subquery with the sequence number and total for each country, on each row. This is a nice feature of SQL Server, but is not supported in all databases.
The from clause joins back to the CountriesStats table to get the needed values for each country, and the case expression does the necessary logic.
Note that the sequential number is assigned randomly, using newid(), so the products should be assigned randomly through the initial table.
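For databases without updatable CTEs, the same idea can be split into a SELECT that computes the assignment and a plain UPDATE per row. The sketch below does this with Python's sqlite3 module, so it is an adaptation rather than the SQL Server statement above: `random()` stands in for `newid()`, the 20 users per country are made-up test data, and window functions require SQLite 3.25 or newer. The comparison is written as `seqnum * 100 <= tot * A` to stay in integer arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (Id INTEGER PRIMARY KEY, Name TEXT, Country TEXT, Product TEXT);
CREATE TABLE CountriesStats (Id INTEGER PRIMARY KEY, Country TEXT,
                             A INTEGER, B INTEGER, C INTEGER);
INSERT INTO CountriesStats (Country, A, B, C) VALUES
    ('Canada', 60, 20, 20), ('Argentina', 35, 45, 20), ('China', 40, 10, 50);
""")

# Hypothetical test data: 20 users in each country.
users = [(f"user{i}", country)
         for country in ("Canada", "Argentina", "China") for i in range(20)]
conn.executemany("INSERT INTO Users (Name, Country) VALUES (?, ?)", users)

# Number the users of each country in random order, then map the sequence
# number onto the cumulative percentage bands A, A+B, rest.
assignments = conn.execute("""
WITH numbered AS (
    SELECT Id, Country,
           ROW_NUMBER() OVER (PARTITION BY Country ORDER BY random()) AS seqnum,
           COUNT(*) OVER (PARTITION BY Country) AS tot
    FROM Users
)
SELECT CASE WHEN n.seqnum * 100 <= n.tot * cs.A THEN 'A'
            WHEN n.seqnum * 100 <= n.tot * (cs.A + cs.B) THEN 'B'
            ELSE 'C'
       END AS product,
       n.Id
FROM numbered n
JOIN CountriesStats cs ON cs.Country = n.Country
""").fetchall()

# Apply the computed (product, id) pairs.
conn.executemany("UPDATE Users SET Product = ? WHERE Id = ?", assignments)

product_counts = {
    (country, product): n
    for country, product, n in conn.execute(
        "SELECT Country, Product, COUNT(*) FROM Users GROUP BY Country, Product")
}
print(product_counts)
```

With millions of rows, fetching every assignment into the client is wasteful; on a database that supports it, a single set-based UPDATE like the one in the answer is the better fit.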