How to do text calculations using columns in SQL

I'd like to conditionally include the values from particular columns in a SQL query. To illustrate the question let me use the fictional example of returning selective data about users.
If we were to return all data we'd use a query of the form:
SELECT
name, email, gender
FROM
users
Assume all users have entered their gender. However some have set the value of gender_public to be false and I would like to reflect this in the data the query returns.
The type of thing I'd like to do is:
SELECT
name, email, if(gender_public, gender, 'N/A')
FROM
users
Is this possible? I'm working with Postgres.

Using CASE
This is the obvious way to go, as suggested by @a_horse_with_no_name.
Assuming gender_public is of type BOOLEAN.
SELECT name, email, CASE WHEN gender_public THEN gender ELSE 'N/A' END AS gender
FROM users;
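To try the CASE expression without a Postgres server, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The sample rows and email addresses are made up for illustration, and SQLite stores the boolean as 0/1 rather than as a true BOOLEAN type, so treat it as a stand-in for the Postgres behaviour rather than a faithful reproduction.

```python
import sqlite3

# Hypothetical sample data; SQLite represents booleans as the integers 0 and 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (name TEXT, email TEXT, gender TEXT, gender_public INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        ("Ann", "ann@example.com", "F", 1),  # gender is public
        ("Bob", "bob@example.com", "M", 0),  # gender is hidden
    ],
)

# Same CASE expression as in the answer: show gender only when gender_public is true.
rows = conn.execute(
    """
    SELECT name, email,
           CASE WHEN gender_public THEN gender ELSE 'N/A' END AS gender
    FROM users
    ORDER BY name
    """
).fetchall()

for row in rows:
    print(row)
```

Bob's row comes back with 'N/A' in place of his stored gender, while Ann's value is returned unchanged.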
Using lateral join
This approach is especially useful if you need the value in more than one place and want it to stay consistent.
SELECT name, email, T.gender
FROM users
CROSS JOIN LATERAL (
SELECT gender WHERE gender_public
UNION
SELECT 'N/A' WHERE NOT(gender_public)
) T(gender);

Related

How can I select all fields except for those with non-distinct values?

I have a table which represents data for people that have applied. Each person has one PERSON_ID, but can have multiple APP_IDs. I want to select all of the columns except for APP_ID (because its values aren't distinct) for all of the distinct people in the table.
I can list every field individually in both the select and group by clauses, and this works:
select PERSON_ID, FIRST,LAST,MIDDLE,BIRTHDATE,SEX,EMAIL,PRIMARY_PHONE from
applications
where first = 'Rob' and last='Robot'
group by PERSON_ID,FIRST,LAST,MIDDLE,BIRTHDATE,SEX,EMAIL,PRIMARY_PHONE
But there are twenty more fields that I may or may not use at any given time.
Is there any shorter way to achieve this sort of selection without being so verbose?
select distinct is shorter:
select distinct PERSON_ID, FIRST, LAST, MIDDLE, BIRTHDATE, SEX, EMAIL, PRIMARY_PHONE
from applications
where first = 'Rob' and last = 'Robot';
But you still have to list out the columns once.
Some more modern databases support an except clause that lets you remove columns from the wildcard list. To the best of my knowledge, Oracle has no similar concept.
You could write a query to bring the columns together from the system tables. That could simplify writing the query and help prevent misspellings.
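As a rough sketch of the "build the column list from the system tables" idea, the snippet below uses SQLite's PRAGMA table_info as a stand-in for the catalog. In Oracle the column names would come from a dictionary view such as USER_TAB_COLUMNS instead. The table layout and sample rows are invented for illustration and are much smaller than the real table.

```python
import sqlite3

# Hypothetical, cut-down version of the applications table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applications (PERSON_ID INTEGER, APP_ID INTEGER, EMAIL TEXT, SEX TEXT)"
)
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?, ?)",
    [
        (1, 10, "rob@example.com", "M"),
        (1, 11, "rob@example.com", "M"),  # same person, second application
        (2, 12, "ann@example.com", "F"),
    ],
)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(applications)")]

# Drop the column whose values are not distinct per person.
wanted = [c for c in columns if c != "APP_ID"]

# The names come from the catalog, not from user input, so joining them is safe here.
query = f"SELECT DISTINCT {', '.join(wanted)} FROM applications ORDER BY PERSON_ID"
rows = conn.execute(query).fetchall()

print(query)
print(rows)
```

The generated statement still lists every column explicitly, but nobody has to type or maintain that list by hand.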

using count and group by correctly

I have a relation CandyC(id, email, age, name, candy_id)
I want to count the CandyC.ids associated with a CandyC.candy_id once.
Attempt:
SELECT email, age, name
FROM CandyC
GROUP BY id
HAVING COUNT(DISTINCT candy_id) = 1;
It gives me an error:
not a group by expression
The group by clause needs to include every selected column that is not aggregated. Also, it's usually a good idea to put having after the group by, as that is the standard way of writing it (even though Oracle supports the other order too).
Does this do what you want:
select email, age, name
from candyc
group by id, email, age, name
having count(distinct candy_id) = 1
If not, you should provide sample data and expected results in your question to clarify.
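To see what the corrected query returns, here is a small sketch using Python's sqlite3 module with invented rows. SQLite is more lenient than Oracle about non-grouped columns, so this does not reproduce the original error. It only shows which rows survive the HAVING filter once every selected column is in the GROUP BY.

```python
import sqlite3

# Hypothetical sample rows for CandyC(id, email, age, name, candy_id).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CandyC (id INTEGER, email TEXT, age INTEGER, name TEXT, candy_id INTEGER)"
)
conn.executemany(
    "INSERT INTO CandyC VALUES (?, ?, ?, ?, ?)",
    [
        (1, "al@example.com", 30, "Al", 100),
        (1, "al@example.com", 30, "Al", 100),   # same candy again: still one distinct candy
        (2, "bea@example.com", 25, "Bea", 100),
        (2, "bea@example.com", 25, "Bea", 200),  # a second candy: filtered out
    ],
)

# Every selected, non-aggregated column appears in GROUP BY.
rows = conn.execute(
    """
    SELECT email, age, name
    FROM CandyC
    GROUP BY id, email, age, name
    HAVING COUNT(DISTINCT candy_id) = 1
    """
).fetchall()

print(rows)
```

Only Al is returned, because Bea is associated with two distinct candy_id values.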
I think you want something more like this:
SELECT candy_id, COUNT(*)
FROM CandyC
GROUP BY candy_id;
I don't know what the email/age/name columns have to do with the question:
I want to count the CandyC.ids associated with a CandyC.candy_id once.

How to select different columns depending on values in a column

So, I have a table with the columns staff, associate and Matter type (which is always either set to 'Lit' or 'PSD'.)
When type field = 'Lit' I need to include the Staff field as the staff field in the select statement. When the type field is set to 'PSD' I need to include the associate field as the staff field in the select statement.
I know I can do this as two separate queries, but I cannot figure out how to combine the two into a single query - there's an easy answer, but after not being able to figure it out for a while, I'm just not coming up with the answer.
Ideas?
SELECT
CASE WHEN
[MatterType] = 'Lit'
THEN
[Staff]
ELSE
[Associate]
END AS [NewStaff]
FROM
MyTable;
This uses an inline case condition in the SELECT list.
To combine the results of two queries with the same number of columns, you can use UNION ALL or UNION. Prefer UNION ALL here: it skips the duplicate-elimination step, so it has less overhead.
SELECT staff AS staff ,
mattertype
FROM my_table
WHERE mattertype = 'Lit'
UNION ALL
SELECT associate AS staff ,
mattertype
FROM my_table
WHERE mattertype = 'PSD'
In your case, I would say using CASE is better:
SELECT CASE WHEN mattertype = 'Lit' THEN staff
ELSE associate
END AS staff
,mattertype
FROM my_table
If I understand your question right, you want either staff or associate as a column called staff, depending on the value of matter. If this is the case, you can use a conditional case ... when expression to select the appropriate column. Something like this:
select matter, case when matter = 'Lit' then staff else associate end as staff from my_table
As you state that matter has to be either Lit or PSD you only need to check if it is one of the values, otherwise it has to be the other (although you could make the check explicit for clarity).
The other answers have covered the common & practical, so here is a variation which is sometimes useful. If your staff column is null when [Matter type] = 'PSD' then this would work:
SELECT COALESCE(staff,associate) AS staff
FROM tablename
;
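The COALESCE variation only matches the CASE version when staff really is NULL on every 'PSD' row. The sketch below, using Python's sqlite3 module and made-up rows, runs both queries side by side. The third row deliberately breaks that assumption to show where the two diverge.

```python
import sqlite3

# Hypothetical rows; the last one has a non-NULL staff on a 'PSD' matter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (staff TEXT, associate TEXT, mattertype TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("Sam", "Alex", "Lit"),
        (None, "Pat", "PSD"),
        ("Kim", "Lee", "PSD"),
    ],
)

# CASE picks the column based on the matter type.
case_rows = conn.execute(
    """
    SELECT CASE WHEN mattertype = 'Lit' THEN staff ELSE associate END AS staff
    FROM my_table
    ORDER BY rowid
    """
).fetchall()

# COALESCE picks the first non-NULL value and ignores the matter type.
coalesce_rows = conn.execute(
    "SELECT COALESCE(staff, associate) AS staff FROM my_table ORDER BY rowid"
).fetchall()

print(case_rows)
print(coalesce_rows)
```

The first two rows agree, but the third yields 'Lee' from CASE and 'Kim' from COALESCE, so COALESCE is only a safe shortcut when the data guarantees the NULL pattern.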

SQL to get rows (not groups) that match an aggregate

Given table USER (name, city, age), what's the best way to get the user details of oldest user per city?
I have seen the following example SQL used in Oracle, which I think works:
select name, city, age
from USER, (select city as maxCity, max(age) as maxAge
from USER
group by city)
where city=maxCity and age=maxAge
So in essence: use a nested query to select the grouping key and aggregate for it, then use it as another table in the main query and join with the grouping key and the aggregate value for each key.
Is this the standard SQL way of doing it? Is it any quicker than using a temporary table, or is it in fact using a temporary table internally anyway?
What you are using will work, although it displays all users which share the max age.
You can do this in a slightly more readable way using the row_number() ranking function:
select name, city, age
from (
select
name
, city
, age
, row_number() over (partition by city order by age desc) as rn
from USER
) sub
where rn = 1
This will also select at most one user per city.
Most database systems will evaluate the inner query themselves, often by materializing it in an internal work table, so I don't think an explicit temporary table would speed it up. But database performance is notoriously hard to predict from a distance :)
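The difference between the two approaches shows up when two users in a city share the maximum age. The sketch below runs both through Python's sqlite3 module, assuming a SQLite build of 3.25 or newer for window function support. The table is named users (a hypothetical rename to avoid the reserved word USER), the rows are invented, and the extra ordering by name is an added tie-breaker so that the result is deterministic.

```python
import sqlite3

# Hypothetical data: Ann and Bob tie for oldest in Oslo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, city TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("Ann", "Oslo", 40), ("Bob", "Oslo", 40), ("Cy", "Oslo", 30), ("Di", "Rome", 50)],
)

# Join against the per-city maximum: returns every user who shares the max age.
join_rows = conn.execute(
    """
    SELECT u.name, u.city, u.age
    FROM users u
    JOIN (SELECT city, MAX(age) AS max_age FROM users GROUP BY city) m
      ON u.city = m.city AND u.age = m.max_age
    ORDER BY u.city, u.name
    """
).fetchall()

# ROW_NUMBER: exactly one row per city; name is used as a tie-breaker (an assumption).
ranked_rows = conn.execute(
    """
    SELECT name, city, age
    FROM (
        SELECT name, city, age,
               ROW_NUMBER() OVER (PARTITION BY city ORDER BY age DESC, name) AS rn
        FROM users
    ) sub
    WHERE rn = 1
    ORDER BY city
    """
).fetchall()

print(join_rows)
print(ranked_rows)
```

The join returns both Ann and Bob for Oslo, while the ranked query keeps only Ann. Which behaviour is right depends on whether ties should be reported.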

Return all Fields and Distinct Rows

What's the best way to do this, when looking for distinct rows?
SELECT DISTINCT name, address
FROM table;
I still want to return all fields, i.e. address1, city etc., but not include them in the DISTINCT row check.
Then you have to decide what to do when there are multiple rows with the same value for the column you want the distinct check to apply to, but with different values in the other columns. In that case, how does the query processor know which of the multiple values in the other columns to output? If you don't care, then just write a group by on the distinct column, with Min() or Max() on all the other ones.
EDIT: I agree with the comments from others that as long as you have multiple dependent columns in the same table (e.g., Address1, Address2, City, State), this approach is going to give you mixed (and therefore inconsistent) results. If each column attribute in the table is independent (if the addresses are all in an Address table and only an AddressId is in this table), then it's not as significant an issue, because at least all the columns from a join to the Address table will generate data for the same address. But you are still getting a more or less random selection of one of the set of multiple addresses.
This will not mix and match your city, state, etc. and should give you the last one added even:
select b.*
from (
select max(id) id, Name, Address
from table a
group by Name, Address) as a
inner join table b
on a.id = b.id
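Here is a small sketch of that join-back pattern using Python's sqlite3 module. The table is called people here (a hypothetical name, since table is a reserved word), and the rows mirror the abc/123 example discussed in this thread. It assumes that a higher id means a more recently added row.

```python
import sqlite3

# Hypothetical rows: two entries share name and address but differ in city.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, address TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "abc", "123", "def"),
        (2, "abc", "123", "hij"),
        (3, "xyz", "9", "klm"),
    ],
)

# Pick the highest id per (name, address), then join back to fetch the whole row,
# so every returned column comes from the same physical row.
rows = conn.execute(
    """
    SELECT b.*
    FROM (
        SELECT MAX(id) AS id, name, address
        FROM people
        GROUP BY name, address
    ) AS a
    INNER JOIN people b ON a.id = b.id
    ORDER BY b.id
    """
).fetchall()

print(rows)
```

For the duplicated abc/123 pair only the row with id 2 is kept, and its city 'hij' travels with it rather than being picked independently.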
When you have a mixed set of fields, some of which you want to be DISTINCT and others that you just want to appear, you require an aggregate query rather than DISTINCT. DISTINCT is only for returning single copies of identical fieldsets. Something like this might work:
SELECT name,
GROUP_CONCAT(DISTINCT address) AS addresses,
GROUP_CONCAT(DISTINCT city) AS cities
FROM the_table
GROUP BY name;
The above will get one row for each name. addresses contains a comma-delimited string of all the distinct addresses for that name. cities does the same for all the cities.
However, I don't see how the results of this query are going to be useful. It will be impossible to tell which address belongs to which city.
If, as is often the case, you are trying to create a query that will output rows in the format you require for presentation, you're much better off accepting multiple rows and then processing the query results in your application layer.
I don't think you can do this because it doesn't really make sense.
name | address | city | etc...
abc | 123 | def | ...
abc | 123 | hij | ...
If you were to include city, but not have it as part of the distinct clause, the value of city would be unpredictable unless you did something like Max(city).
You can do
SELECT Name, Address, MAX(Address1), MAX(City)
FROM table
GROUP BY Name, Address
Use @JBrooks's answer below. He has a better answer.
If you're using SQL Server 2005 or above you can use the ROW_NUMBER() function. This will get you the row with the lowest ID for each name. If you want to 'group' by more columns, add them to the PARTITION BY clause of ROW_NUMBER().
SELECT id, Name, Address, ...
FROM
(select id, Name, Address, ...,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY id) AS RowNo
from table) sub
WHERE RowNo = 1