SQL value at min query without join - sql

Quite often I have to solve the following problem. Suppose I have 3 columns: Col A, Col B, Col C.
Col A contain some date which I need group on. Col B contains value for which I need to find minimum for each group in Col A. And Col C contains data for which I would like to find where minimum occurred, that is pull value from Col C corresponding to min in Col B.
I commonly solve this problem, but writing JOIN statement. Does anybody know better solution, without JOIN? The problem seem so common to me that I would image it would make sense to create dedicated SQL command. It is like finding value of the function at a point of minimum (from math point of view)
Regards,
UPDATE.
I apologize for not posting an example of what I mean. Here is my table:
Location Quantity Street
New York 2 Broad
New York 3 Main
Pittsburgh 1 Grove
Pittsburgh 5 School
Austin 7 Hayes
Austin 2 Barn
I would like to group by "Location" and choose min in "Quantity" for each group. Finally I would like to find "Street" corresponding to each min value.
Here is how my final output should look like:
Location Quantity Street
Austin 2 Barn
New York 2 Broad
Pittsburgh 1 Grove
And here is how I accomplish it ( I believe it is too long and there should be dedicated SQL function for it)
SELECT TheFunction.Location, Quantity, Street
FROM [AnalysisDatabase].[dbo].[ValueAtMin] As TheFunction
inner join
(SELECT [Location], Min([Quantity]) as MinValue
FROM [AnalysisDatabase].[dbo].[ValueAtMin] Group by [Location]) as XValue
on TheFunction.Location = XValue.Location and XValue.MinValue = TheFunction.Quantity

This works with your dataset in mysql:
select location, quantity, street
from tempso c
where quantity = (select min(quantity)
from tempso b
where c.location = b.location)
order by location

Try this sql below:
Data table (using #temp)
location quantity Street
------------------------------ ----------- ------------------------------
New York 2 Broad
New York 3 Main
Pittsburgh 1 Grove
Pittsburgh 5 School
Austin 7 Hayes
Austin 2 Barn
SQL:
select location, quantity, street from #temp a
where quantity in (select min(quantity) as quantity from #temp b group by location having a.location = b.location)
order by location asc, quantity desc
Data Result:
location quantity street
------------------------------ ----------- ------------------------------
Austin 2 Barn
New York 2 Broad
Pittsburgh 1 Grove

In postgresql:
select
distinct on (a)
a,b,c
from tbl
order by a, b
You can also achieve it with a window function to figure out the correct c,
and then group by a (by putting the window query inside the grouping query)

Related

SQLite query to get table based on values of another table

I am not sure what title has to be here to correctly reflect my question, I can only describe what I want.
There is a table with fields:
id, name, city
There are next rows:
1 John London
2 Mary Paris
3 John Paris
4 Samy London
I want to get a such result:
London Paris
Total 2 2
John 1 1
Mary 0 1
Samy 1 0
So, I need to take all unique values of name and find an appropriate quantity for unique values of another field (city)
Also I want to get a total quantity of each city
Simple way to do it is:
1)Get a list of unique names
SELECT DISTINCT name FROM table
2)Get a list of unique cities
SELECT DISTINCT city FROM table
3)Create a query for every name and city
SELECT COUNT(city) FROM table WHERE name = some_name AND city = some_city
4)Get total:
SELECT COUNT(city) FROM table WHERE name = some_name
(I did't test these queries, so maybe there are some errors here but it's only to show the idea)
As there are 3 names and 2 cities -> 3 * 2 = 6 queries to DB
But for a table with 100 cities and 100 names -> 100 * 100 = 10 000 queries to DB
and it may take a lot of time to do.
Also, names and cities may be changed, so, I can't create a query with predefined names or cities as every day it's new ones, so, instead of London and Paris it may be Moscow, Turin and Berlin. The same thing with names.
How to get such table with one-two queries to original table using sqlite?
(sqlite: I do it for android)
You can get the per-name results with conditional aggregation. As for the total, unfortunately SQLite does not support the with rollup clause, that would generate it automatically.
One workaround is union all and an additional column for ordering:
select name, london, paris
from (
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
) t
order by prio, name
Actually the subquery might not be necessary:
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
order by prio, name
#GMB gave me the idea of using group by, but as I do it for SQLite on Android, so, the answer looks like:
SELECT name,
COUNT(CASE WHEN city = :london THEN 1 END) as countLondon,
COUNT(CASE WHEN city = :paris THEN 1 END) as countParis
FROM table2 GROUP BY name
where :london and :paris are passed params, and countLondon and countParis are fields of the response class

How to find people in a database who live in the same cities?

I'm new to SQL, and I'm asking for help in an apparently easy question, but it gets cumbersome in my mind.
I have the following table:
ID NAME CITY
---------------------
1 John new york
2 Sam new york
3 Tom boston
4 Bob boston
5 Jan chicago
6 Ted san francisco
7 Kat boston
I want a query that returns all the people who live in a city that another person registered in the database also lives in.
The answer, for the table I showed above, would be:
ID NAME CITY
---------------------
1 John new york
2 Sam new york
3 Tom boston
4 Bob boston
7 Kat boston
This is really a two part question:
What cities have more than one user located in them?
What users live in that subset of cities?
Let's answer it in two parts. Let's also make the simplifying assumption (not stated in your question) that the Users table has only one entry per user per city.
To find cities with more than one user:
SELECT City FROM Users GROUP BY City HAVING COUNT(*) > 1
Now, let's find all the users for those cities:
SELECT ID, User, City FROM Users
WHERE City IN (SELECT City FROM Users GROUP BY CITY HAVING COUNT(*) > 1)
I would use EXISTS :
SELECT t.*
FROM table t
WHERE EXISTS (SELECT 1 FROM table t1 WHERE t1.city = t.city AND t1.name <> t.name);
To avoid a correlated subquery which leads to a nested loop, you could perform a self join:
SELECT id, name, city
FROM persons
JOIN (SELECT city
FROM persons
GROUP BY city HAVING count(*) > 1) AS cities
USING (city);
This might be the most performant solution.
This will give you the rows that have the same city more than 1 time:
SELECT persons.*
FROM persons
WHERE (SELECT COUNT(*) FROM persons AS p GROUP BY CITY HAVING p.CITY = persons.CITY) > 1
This is just a different flavor from the others that have posted.
SELECT ID,
name,
city
FROM (SELECT DISTINCT
ID,
name,
city,
COUNT(1) OVER (PARTITION BY city) AS cityCount
FROM table) t
WHERE cityCount > 1
This can be expressed many ways. Here is one possible way:
select * from persons p
where exists (
select 1 from persons p2
where p2.city = p.city and p2.name <> p.name
)

SQL - how to transpose only some row values into column headers without pivot

I have a table similar to this:
stud_ID | first_name | last_name | email | col_num | user_value
1 tom smith 50 Retail
1 tom smith 60 Product
2 Sam wright 50 Retail
2 Sam wright 60 Sale
but need to convert it to: (basically transpose 'col_num' to column headers and change 50 to function, 60 to department)
stud_ID | first_name | last_name | email | Function | Department
1 tom smith Retail Product
2 Sam wright Retail Sale
Unfortunately Pivot doesn't work in my system, just wondering if there is any other way to do this please?
The code that I have so far (sorry for the long list):
SELECT c.person_id_external as stu_id,
c.lname,
c.fname,
c.mi,
a.cpnt_id,
a.cpnt_typ_id,
a.rev_dte,
a.rev_num,
cp.cpnt_title AS cpnt_desc,
a.compl_dte,
a.CMPL_STAT_ID,
b.cmpl_stat_desc,
b.PROVIDE_CRDT,
b.INITIATE_LEVEL1_SURVEY,
b.INITIATE_LEVEL3_SURVEY,
a.SCHD_ID,
a.TOTAL_HRS,
a.CREDIT_HRS,
a.CPE_HRS,
a.CONTACT_HRS,
a.TUITION,
a.INST_NAME,
--a.COMMENTS,
a.BASE_STUD_ID,
a.BASE_CPNT_TYP_ID,
a.BASE_CPNT_ID,
a.BASE_REV_DTE,
a.BASE_CMPL_STAT_ID,
a.BASE_COMPL_DTE,
a.ES_USER_NAME,
a.INTERNAL,
a.GRADE_OPT,
a.GRADE,
a.PMT_ORDER_TICKET_NO,
a.TICKET_SEQUENCE,
a.ORDER_ITEM_ID,
a.ESIG_MESSAGE,
a.ESIG_MEANING_CODE_ID,
a.ESIG_MEANING_CODE_DESC,
a.CPNT_KEY,
a.CURRENCY_CODE,
c.EMP_STAT_ID,
c.EMP_TYP_ID,
c.JL_ID,
c.JP_ID,
c.TARGET_JP_ID,
c.JOB_TITLE,
c.DMN_ID,
c.ORG_ID,
c.REGION_ID,
c.CO_ID,
c.NOTACTIVE,
c.ADDR,
c.CITY,
c.STATE,
c.POSTAL,
c.CNTRY,
c.SUPER,
c.COACH_STUD_ID,
c.HIRE_DTE,
c.TERM_DTE,
c.EMAIL_ADDR,
c.RESUME_LOCN,
c.COMMENTS,
c.SHIPPING_NAME,
c.SHIPPING_CONTACT_NAME,
c.SHIPPING_ADDR,
c.SHIPPING_ADDR1,
c.SHIPPING_CITY,
c.SHIPPING_STATE,
c.SHIPPING_POSTAL,
c.SHIPPING_CNTRY,
c.SHIPPING_PHON_NUM,
c.SHIPPING_FAX_NUM,
c.SHIPPING_EMAIL_ADDR,
c.STUD_PSWD,
c.PIN,
c.PIN_DATE,
c.ENCRYPTED,
c.HAS_ACCESS,
c.BILLING_NAME,
c.BILLING_CONTACT_NAME,
c.BILLING_ADDR,
c.BILLING_ADDR1,
c.BILLING_CITY,
c.BILLING_STATE,
c.BILLING_POSTAL,
c.BILLING_CNTRY,
c.BILLING_PHON_NUM,
c.BILLING_FAX_NUM,
c.BILLING_EMAIL_ADDR,
c.SELF_REGISTRATION,
c.SELF_REGISTRATION_DATE,
c.ACCESS_TO_ORG_FIN_ACT,
c.NOTIFY_DEV_PLAN_ITEM_ADD,
c.NOTIFY_DEV_PLAN_ITEM_MOD,
c.NOTIFY_DEV_PLAN_ITEM_REMOVE,
c.NOTIFY_WHEN_SUB_ITEM_COMPLETE,
c.NOTIFY_WHEN_SUB_ITEM_FAILURE,
c.LOCKED,
c.PASSWORD_EXP_DATE,
c.SECURITY_QUESTION,
c.SECURITY_ANSWER,
c.ROLE_ID,
c.IMAGE_ID,
c.GENDER,
c.PAST_SERVICE,
c.LST_UNLOCK_TSTMP,
c.MANAGE_SUB_SP,
c.MANAGE_OWN_SP,
d.col_num,
d.user_value
FROM pa_cpnt_evthst a,
pa_cmpl_stat b,
pa_student c,
pv_course cp,
pa_stud_user d
WHERE a.cmpl_stat_id = b.cmpl_stat_id
AND a.stud_id = c.stud_id
AND cp.cpnt_typ_id(+) = a.cpnt_typ_id
AND cp.cpnt_id(+) = a.cpnt_id
AND cp.rev_dte(+) = a.rev_dte
AND a.CPNT_TYP_ID != 'SYSTEM_PROGRAM_ENTITY'
AND c.stud_id = d.stud_id
AND d.col_num in ('10','30','50','60')
I would just use conditional aggregation:
select stud_ID, first_name, last_name, email,
max(case when col_num = 50 then user_value end) as function,
max(case when col_num = 60 then user_value end) as department
from t
group by stud_ID, first_name, last_name, email;
Your code seems to have nothing to do with the sample data. I do notice however that you are using implicit join syntax. You really need to learn how to use proper, explicit, standard JOIN syntax.
I'm assuming you have Sql Server 2000 or 2003. What you need to do in that case is create a script with one cursor.
This cursor will create a text with something like this:
string var = "CREATE TABLE #Report (Col1 VARCHAR(20), Col2, VARCHAR(20), " + ColumnName
That way you can create a temp table on the fly, at the end you will need to do a Select of your temp table to get your pivot table ready.
Its not that easy if you are not familiar with cursors.
OR
if there are only few values on your 'pivot' column and they are not going to grow you can also do something like this:
Pivot using SQL Server 2000
I'm unable to understand your code, so I'll just assume the table mentioned in the sample data as stud(because of stud_id).
So here is what I think can do the work of pivot.
SELECT ISNULL(s1.stud_ID, s2.stud_id),
ISNULL(s1.first_name, s2.first_name),
ISNULL(s1.last_name, s2.last_name),
ISNULL(s1.email, s2.email),
s1.user_value as [Function], s2.user_value as Department
FROM stud s1 OUTER JOIN stud s2
ON s1.stud_ID = s2.stud_ID -- Assuming stud_ID is primary key, else join on all primary keys
AND s1.col_num = 50 AND s2.col_num = 60
Explanation: I'm just trying to simulate here what PIVOT does. For every column you want, you create a new table in the JOIN and constaint it to only one value in your col_num column. For example, if there are no values for 50 in s1, the OUTER JOIN will get make it NULL and we need to pull records from s2.
Note: If you need more than 2 new columns, then you can use COALESCE instead of ISNULL

SQL: Selecting the lowest value row

I am looking at hospital claims data and there are multiple rows with the same admission date. I only want one admission date per patient. If there are multiple rows with the same admission date, I want to select the row with the largest LOS, or when LOS are equal, I want to select the one with the oldest admission date. For example, given the following data:
ID ADMIT DC LOS CLMID
-- ----- -- --- -----
1 1-1-07 1-1-07 0 XXX
1 1-2-07 1-2-07 0 XXX
2 1-5-07 1-10-07 5 YYY
3 2-8-07 2-8-07 0 ZZZ
3 2-8-07 2-12-07 4 ZZZ
3 2-8-07 2-10-07 2 ZZZ
I would want to select:
ID ADMIT DC LOS CLMID
-- ----- -- --- -----
1 1-1-07 1-1-07 0 XXX
2 1-5-07 1-10-07 5 YYY
3 2-8-07 2-12-07 4 ZZZ
I've tried using the MIN aggregrate function, but I'm pretty lost on how to get where I want. I'm new to SQL and would appreciate any help!
So far, this is my best shot:
SELECT DISTINCT
ID, ADMIT, DC, LOS, CLMID, MIN(ADMIT)
FROM
TABLE1
GROUP BY
ID, ADMIT, DC, LOS, CLMID
ORDER BY
ID
I've also tried just selecting just the maximum LOS instead of the minimum admit, but that doesn't do it either.
Thanks :)
This is a prioritization and you can solve these problems with row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by admit asc, los desc) as seqnum
from table1 t
) t
where seqnum = 1;
A couple of notes:
I assume that dates are actually stored as date/times in the database and not as strings.
The conditions in your first paragraph are a bit vague. This gets the one row for each patient withi the highest los on the earliest admit date.
In MySQL this would be something like:
Select distinct ID, * from admissions_table order by DC DESC group by ID

Fill Users table with data using percentages from another table

I have a Table Users (it has millions of rows)
Id Name Country Product
+----+---------------+---------------+--------------+
1 John Canada
2 Kate Argentina
3 Mark China
4 Max Canada
5 Sam Argentina
6 Stacy China
...
1000 Ken Canada
I want to fill the Product column with A, B or C based on percentages.
I have another table called CountriesStats like the following
Id Country A B C
+-----+---------------+--------------+-------------+----------+
1 Canada 60 20 20
2 Argentina 35 45 20
3 China 40 10 50
This table holds the percentage of people with each product. For example in Canada 60% of people have product A, 20% have product B and 20% have product C.
I would like to fill the Users table with data based on the Percentages in the second data. So for example if there are 1 million user in canada, I would like to fill 600000 of the Product column in the Users table with A 200000 with B and 200000 with C
Thanks for any help on how to do that. I do not mind doing it in multiple steps I jsut need hints on how can I achieve that in SQL
The logic behind this is not too difficult. Assign a sequential counter to each person in each country. Then, using this value, assign the correct product based on this value. For instance, in your example, when the number is less than or equal to 600,000 then 'A' gets assigned. For 600,001 to 800,000 then 'B', and finally 'C' to the rest.
The following SQL accomplishes this:
with toupdate as (
select u.*,
row_number() over (partition by country order by newid()) as seqnum,
count(*) over (partition by country) as tot
from users u
)
update u
set product = (case when seqnum <= tot * A / 100 then 'A'
when seqnum <= tot * (A + B) / 100 then 'B'
else 'C'
end)
from toupdate u join
CountriesStats cs
on u.country = cs.country;
The with statement defines an updatable subquery with the sequence number and total for each each country, on each row. This is a nice feature of SQL Server, but is not supported in all databases.
The from statement is joining back to the CountriesStats table to get the needed values for each country. And the case statement does the necessary logic.
Note that the sequential number is assigned randomly, using newid(), so the products should be assigned randomly through the initial table.