I have a user database where users are assigned an ID number in a certain range according to their type. For example, board members get an ID between 1 and 100, children get an ID between 1001 and 3000, parents get an ID between 3001 and 7000 etc.*
I'd like to get a list of the highest number in use for each "segment" of my IDs.
I can of course get the highest number of all by doing
SELECT MAX(Persons.Number) as Maximum FROM Persons
and get the highest number below 3000 like this:
SELECT MAX(Persons.Number) as MaxChild FROM Persons WHERE Persons.Number<=3000
...but how could I get the highest number below 100 AND the highest number below 1000 AND the highest number below 3000 etc. etc. with a single SELECT statement?
*I do have these characteristics stored in the database elsewhere; the "bucketing" of ID numbers is just for making it easier to spot at first glance where a certain user belongs
Just use IF():
SELECT
MAX(IF(Persons.Number BETWEEN x AND y, Persons.Number, NULL)) AS max_range_x_y,
MAX(IF(Persons.Number BETWEEN i AND j, Persons.Number, NULL)) AS max_range_i_j, ...
FROM Persons;
The above is MySQL syntax. In SQL Server you might use IIF() instead. What should work in every RDBMS (because it is standard ANSI SQL) is
SELECT
MAX(CASE WHEN Persons.Number BETWEEN x AND y THEN Persons.Number ELSE NULL END) AS max_range_x_y,
MAX(CASE WHEN Persons.Number BETWEEN i AND j THEN Persons.Number ELSE NULL END) AS max_range_i_j, ...
FROM Persons;
SELECT
MAX(CASE WHEN Number BETWEEN 1 AND 100 THEN Number ELSE NULL END) AS BoardMax,
MAX(CASE WHEN Number BETWEEN 1001 AND 3000 THEN Number ELSE NULL END) AS ChildMax,
MAX(CASE WHEN Number BETWEEN 3001 AND 7000 THEN Number ELSE NULL END) AS ParentMax
FROM
Persons
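To try the conditional-aggregate idea without a MySQL or SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration; only the Persons table, its Number column and the three ranges come from the question.

```python
import sqlite3

# Hypothetical sample data: only Persons.Number and the ranges come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (Number INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO Persons (Number) VALUES (?)",
    [(3,), (57,), (1001,), (1420,), (3001,), (3002,)],
)

# One pass over the table: each CASE yields NULL outside its range,
# and MAX ignores NULLs, so every column sees only its own segment.
query = """
SELECT
    MAX(CASE WHEN Number BETWEEN 1    AND 100  THEN Number END) AS BoardMax,
    MAX(CASE WHEN Number BETWEEN 1001 AND 3000 THEN Number END) AS ChildMax,
    MAX(CASE WHEN Number BETWEEN 3001 AND 7000 THEN Number END) AS ParentMax
FROM Persons
"""
board_max, child_max, parent_max = conn.execute(query).fetchone()
print(board_max, child_max, parent_max)
```

A segment with no rows at all comes back as NULL (None in Python), which is usually what you want when checking which ranges are still empty.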
I have three columns, all consisting of 1's and 0's. For each of these columns, how can I calculate the percentage of people (one person is one row/ id) who have a 1 in the first column and a 1 in the second or third column in oracle SQL?
For instance:
id marketing_campaign personal_campaign sales
1 1 0 0
2 1 1 0
1 0 1 1
4 0 0 1
So in this case, of all the people who were subjected to a marketing_campaign, 50 percent were subjected to a personal campaign as well, but zero percent is present in sales (no one bought anything).
Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels.
This is a fictional example, so I realize that in this example there are many other ways to do this, but I hope anyone can help!
The outcome that I'm looking for is something like this:
percentage marketing_campaign/ personal campaign = 50 %
percentage marketing_campaign/sales = 0%
etc (for all the three column combinations)
Use COUNT, SUM and CASE expressions, together with the basic arithmetic operators +, / and *:
COUNT(*) gives the total number of people (rows) in the table
SUM(column) gives the number of 1s in the given column
CASE expressions make it possible to implement more complex conditions
The common pattern is X / COUNT(*) * 100, which calculates a given value as a percentage of the total ( val / total * 100% )
An example:
SELECT
-- percentage of people that have 1 in marketing_campaign column
SUM( marketing_campaign ) / COUNT(*) * 100 As marketing_campaign_percent,
-- percentage of people that have 1 in sales column
SUM( sales ) / COUNT(*) * 100 As sales_percent,
-- complex condition:
-- percentage of people (one person is one row/ id) who have a 1
-- in the first column and a 1 in the second or third column
COUNT(
CASE WHEN marketing_campaign = 1
AND ( personal_campaign = 1 OR sales = 1 )
THEN 1 END
) / COUNT(*) * 100 As complex_condition_percent
FROM table;
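The same pattern can be checked against the four sample rows from the question. This is only a sketch run on SQLite through Python's sqlite3 module, not on Oracle, and the table name campaigns is a placeholder I made up. One difference to watch: SQLite divides integers as integers, so the sketch multiplies by 100.0 first; Oracle's NUMBER arithmetic does not need that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "campaigns" is a hypothetical table name; the columns and rows are the question's sample.
conn.execute(
    "CREATE TABLE campaigns ("
    "id INTEGER, marketing_campaign INTEGER, personal_campaign INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?, ?, ?)",
    [(1, 1, 0, 0), (2, 1, 1, 0), (1, 0, 1, 1), (4, 0, 0, 1)],
)

# Multiplying by 100.0 first forces floating-point division in SQLite.
query = """
SELECT
    100.0 * SUM(marketing_campaign) / COUNT(*) AS marketing_campaign_percent,
    100.0 * SUM(sales) / COUNT(*)              AS sales_percent,
    100.0 * COUNT(CASE WHEN marketing_campaign = 1
                        AND (personal_campaign = 1 OR sales = 1)
                       THEN 1 END) / COUNT(*)  AS complex_condition_percent
FROM campaigns
"""
row = conn.execute(query).fetchone()
print(row)
```

Note that these percentages are relative to all rows. To get a percentage relative to only the people who saw the marketing campaign, divide by SUM(marketing_campaign) instead of COUNT(*).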
You can get your percentages like this :
SELECT COUNT(*),
ROUND(100*(SUM(personal_campaign) / sum(count(*)) over ()),2) perc_personal_campaign,
ROUND(100*(SUM(sales) / sum(count(*)) over ()),2) perc_sales
FROM (
SELECT ID,
CASE
WHEN SUM(personal_campaign) > 0 THEN 1
ELSE 0
end AS personal_campaign,
CASE
WHEN SUM(sales) > 0 THEN 1
ELSE 0
end AS sales
FROM the_table
WHERE ID IN
(SELECT ID FROM the_table WHERE marketing_campaign = 1)
GROUP BY ID
)
I have overcomplicated things a bit because your data is still unclear to me. The subquery ensures that all duplicates are cleaned up and that each person has only a single 1 or 0 in personal_campaign and sales.
About your second question :
Ultimately, I want to find out the order in which people get to the
sales moment. Do they first go from marketing campaign to a personal
campaign and then to sales, or do they buy anyway regardless of these
channels.
This is impossible with the table as it stands, because it has neither:
a unique row identifier that preserves the order in which the rows were inserted, nor
a timestamp column that records when the rows were inserted.
Without one of these, the order of rows returned from your table is unpredictable or, if you prefer, effectively random.
I am trying to create a table that will count the occurrences of each position for various offices.
So if my data is as follows:
Office Position
A Manager
A Supervisor
A Entry Level
A Entry Level
B Manager
B Entry Level
I would want my code to return:
Office Managers Supervisors EntryLevel
A 1 1 2
B 1 0 1
I have my code below. The issue is that this code counts the total amount of occurrences, not the unique count to each office. The results are as follows
A 2 1 3
B 2 1 3
CREATE TABLE OfficeTest AS
SELECT DISTINCT Office,
(Select COUNT(Position) FROM OfficeData WHERE Position = 'Manager') as Managers,
(Select COUNT(Position) FROM OfficeData WHERE Position = 'Supervisor') as Supervisors,
(Select COUNT(Position) FROM OfficeData WHERE Position = 'Entry Level') as EntryLevel
FROM OfficeData
GROUP BY Office;
Any ideas on how to fix this?
The easiest way I can think of doing this is like this:
SELECT Office,
COUNT(CASE WHEN Position = 'Manager' THEN Position END) AS Managers,
COUNT(CASE WHEN Position = 'Supervisor' THEN Position END) AS Supervisors,
COUNT(CASE WHEN Position = 'Entry Level' THEN Position END) AS EntryLevel
FROM OfficeData
GROUP BY Office
COUNT ignores missing values: if Position is not the value named in the CASE expression, the CASE returns a missing value and the row is not counted. This way each column counts only the Position value it compares against.
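For anyone who wants to see the per-office counts without a SAS session, here is a rough sketch of the COUNT(CASE ...) approach. It runs on SQLite via Python's sqlite3 module rather than PROC SQL, which is a substitution on my part, and it assumes the columns are named Office and Position as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumes columns Office and Position, as in the sample data from the question.
conn.execute("CREATE TABLE OfficeData (Office TEXT, Position TEXT)")
conn.executemany(
    "INSERT INTO OfficeData VALUES (?, ?)",
    [
        ("A", "Manager"), ("A", "Supervisor"),
        ("A", "Entry Level"), ("A", "Entry Level"),
        ("B", "Manager"), ("B", "Entry Level"),
    ],
)

# CASE without ELSE yields NULL for non-matching rows, and COUNT skips NULLs.
# GROUP BY keeps the counts local to each office.
query = """
SELECT Office,
       COUNT(CASE WHEN Position = 'Manager'     THEN Position END) AS Managers,
       COUNT(CASE WHEN Position = 'Supervisor'  THEN Position END) AS Supervisors,
       COUNT(CASE WHEN Position = 'Entry Level' THEN Position END) AS EntryLevel
FROM OfficeData
GROUP BY Office
ORDER BY Office
"""
office_counts = conn.execute(query).fetchall()
print(office_counts)
```

The original query failed because its subqueries were not correlated with the outer Office, so each one counted the whole table. Here the aggregation happens inside the GROUP BY, so each count is scoped to its office.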
Another option, as stated in the comments, would be pivoting the table. The SAS equivalent is the TRANSPOSE procedure. I don't have a SAS system to create and test a query using it, but here's the documentation in case you want to check it out.
Just to flesh out Danny's comment a bit, the SUM code would look like:
proc sql;
CREATE TABLE want AS
SELECT office,
SUM( (position='Manager') ) as Managers,
SUM( (position='Supervisor') ) as Supervisors,
SUM( (position='Entry Level') ) as EntryLevel
FROM OfficeData
GROUP BY office
;quit;
The (position='Manager') bit resolves to 0 or 1, depending on whether it's true for the current record. I find the SUM version a lot more concise and legible, but both should work for your situation. Plus, it's easily extensible to more than one criterion, like (position='Manager')*(sex='F') to count only female managers.
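The boolean-as-number trick can be sketched outside SAS as well. The example below uses SQLite through Python's sqlite3 module, where a comparison also evaluates to 0 or 1; that is an assumption about SQLite, not a claim about every database. The sex column and its values are hypothetical, added only to illustrate the two-criteria product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The "sex" column and its values are hypothetical; the question's data has no such column.
conn.execute("CREATE TABLE OfficeData (office TEXT, position TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO OfficeData VALUES (?, ?, ?)",
    [
        ("A", "Manager", "F"), ("A", "Supervisor", "M"),
        ("A", "Entry Level", "F"), ("A", "Entry Level", "M"),
        ("B", "Manager", "M"), ("B", "Entry Level", "F"),
    ],
)

# In SQLite a comparison evaluates to 1 (true) or 0 (false), so SUM counts matches.
# Multiplying two comparisons acts as a logical AND.
query = """
SELECT office,
       SUM(position = 'Manager')               AS Managers,
       SUM((position = 'Manager') * (sex = 'F')) AS FemaleManagers
FROM OfficeData
GROUP BY office
ORDER BY office
"""
manager_counts = conn.execute(query).fetchall()
print(manager_counts)
```

In databases that do not treat comparisons as numbers, the portable spelling is SUM(CASE WHEN ... THEN 1 ELSE 0 END).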
SUM with a CASE expression should resolve the issue. Below is some reference code:
proc sql;
create table result as
select age
, sum(case sex when 'F' then 1 else 0 end) as Female
, sum(case sex when 'M' then 1 else 0 end) as Male
from sashelp.class
group by age;
quit;
proc print data=result;run;
After many attempts I have failed at this and am hoping someone can help. The query returns every entry a user makes when items are made in the factory against an order number. For example:
Order Number Entry type Quantity
3000 1 1000
3000 1 500
3000 2 300
3000 2 100
4000 2 1000
5000 1 1000
What I want the query to do is filter the results like this:
If the order number has entries of both type 1 and type 2, return only the rows of type 1;
otherwise just return the rows for that order number, whatever the type.
So the above would end up:
Order Number Entry type Quantity
3000 1 1000
3000 1 500
4000 2 1000
5000 1 1000
Currently my query (DB2, in very basic terms) looks like this, and it was correct until a change request came through!
Select * from bookings where type=1 or type=2
thanks!
select * from bookings
left outer join (
select order_number,
max(case when type=1 then 1 else 0 end) +
max(case when type=2 then 1 else 0 end) as type_1_and_2
from bookings
group by order_number
) has_1_and_2 on
type_1_and_2 = 2 and
has_1_and_2.order_number = bookings.order_number
where
bookings.type = 1 or
has_1_and_2.order_number is null
Find all the orders that have both type 1 and type 2, and then join to that result.
If the row matched the join, only return it if it is type 1.
If the row did not match the join (has_1_and_2.order_number is null), return it no matter what the type is.
A "common table expression" [CTE] can often simplify your logic. You can think of it as a way to break a complex problem into conceptual steps. In the example below, you can think of g as the name of the result set of the CTE, which will then be joined to the bookings table.
WITH g as
( SELECT order_number, min(type) as low_type
FROM bookings
GROUP BY order_number
)
SELECT b.*
FROM g
JOIN bookings b ON g.order_number = b.order_number
AND g.low_type = b.type
The JOIN ON conditions will work so that if both types are present then low_type will be 1, and only that type of record will be chosen. If there is only one type it will be identical to low_type.
This should work fine as long as 1 and 2 are the only types allowed in the bookings table. If not then you can simply add a WHERE clause in the CTE and in the outer SELECT.
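As a sanity check, both approaches (the outer join on a per-order flag and the MIN(type) CTE) can be run against the sample rows from the question. This is a sketch on SQLite via Python's sqlite3 module, not DB2, and the column names order_number, type and quantity are my assumption about how the bookings table is laid out. The CTE version includes the optional WHERE filter mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumed; the rows are the sample data from the question.
conn.execute("CREATE TABLE bookings (order_number INTEGER, type INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [
        (3000, 1, 1000), (3000, 1, 500), (3000, 2, 300),
        (3000, 2, 100), (4000, 2, 1000), (5000, 1, 1000),
    ],
)

# Approach 1: flag orders that have both types, then keep type 1 rows for
# flagged orders and every row for unflagged orders.
join_query = """
SELECT b.order_number, b.type, b.quantity
FROM bookings b
LEFT OUTER JOIN (
    SELECT order_number,
           MAX(CASE WHEN type = 1 THEN 1 ELSE 0 END) +
           MAX(CASE WHEN type = 2 THEN 1 ELSE 0 END) AS type_1_and_2
    FROM bookings
    GROUP BY order_number
) has_1_and_2
  ON has_1_and_2.type_1_and_2 = 2
 AND has_1_and_2.order_number = b.order_number
WHERE b.type = 1 OR has_1_and_2.order_number IS NULL
ORDER BY b.order_number, b.quantity DESC
"""

# Approach 2: keep only rows whose type equals the lowest type of their order.
# The WHERE clause guards against types other than 1 and 2.
cte_query = """
WITH g AS (
    SELECT order_number, MIN(type) AS low_type
    FROM bookings
    WHERE type IN (1, 2)
    GROUP BY order_number
)
SELECT b.order_number, b.type, b.quantity
FROM g
JOIN bookings b ON g.order_number = b.order_number
               AND g.low_type = b.type
ORDER BY b.order_number, b.quantity DESC
"""

join_rows = conn.execute(join_query).fetchall()
cte_rows = conn.execute(cte_query).fetchall()
print(join_rows)
print(cte_rows)
```

On this data the two queries return the same four rows. The ORDER BY is only there to make the output deterministic for comparison.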
I need to calculate the number of occurrences of some data in two columns in one query. The DB is SQL Server 2005.
For example I have this table:
Person: Id, Name, Age
And I need to get these results in one query:
1. Count of Persons that have the name 'John'
2. Count of 'John' with an age over 30
I can do that with subqueries in this way (this is only an example):
SELECT (SELECT COUNT(Id) FROM Persons WHERE Name = 'John'),
(SELECT COUNT (Id) FROM Persons WHERE Name = 'John' AND age > 30)
FROM Persons
But this is very slow, and I'm searching for faster method.
I found this solution for MySQL (it almost solves my problem, but it is not for SQL Server).
Do you know a better way to calculate several counts in one query than using subqueries?
Using a CASE statement lets you count whatever you want in a single query:
SELECT
SUM(CASE WHEN Persons.Name = 'John' THEN 1 ELSE 0 END) AS JohnCount,
SUM(CASE WHEN Persons.Name = 'John' AND Persons.Age > 30 THEN 1 ELSE 0 END) AS OldJohnsCount,
COUNT(*) AS AllPersonsCount
FROM Persons
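Here is a small runnable sketch of that single-pass query. It uses SQLite through Python's sqlite3 module rather than SQL Server 2005, and the four sample people are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (Id INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
# Hypothetical sample rows, not from the question.
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?)",
    [(1, "John", 25), (2, "John", 35), (3, "John", 41), (4, "Mary", 50)],
)

# Each SUM adds 1 for rows that satisfy its condition and 0 otherwise,
# so all three counts come out of one scan of the table.
query = """
SELECT
    SUM(CASE WHEN Name = 'John' THEN 1 ELSE 0 END)              AS JohnCount,
    SUM(CASE WHEN Name = 'John' AND Age > 30 THEN 1 ELSE 0 END) AS OldJohnsCount,
    COUNT(*)                                                    AS AllPersonsCount
FROM Persons
"""
john_count, old_johns_count, all_persons_count = conn.execute(query).fetchone()
print(john_count, old_johns_count, all_persons_count)
```

Unlike the subquery version in the question, this returns exactly one row instead of one row per person.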
Use:
SELECT COUNT(p.id),
SUM(CASE WHEN p.age > 30 THEN 1 ELSE 0 END)
FROM PERSONS p
WHERE p.name = 'John'
When a query accesses the same table more than once, it is always worth checking whether it can be done in a single pass (one SELECT statement). It won't always be possible.
Edit:
If you need to do other things in the query, see Chris Shaffer's answer.
I have a table with a column that allows nulls. If the value is null it is incomplete. I want to calculate the percentage complete.
Can this be done in MySQL through SQL or should I get the total entries and the total null entries and calculate the percentage on the server?
Either way, I'm very confused on how I need to go about separating the variable_value so that I can get its total results and also its total NULL results.
SELECT
games.id
FROM
games
WHERE
games.category_id='10' AND games.variable_value IS NULL
This gives me all the games where the variable_value is NULL. How do I extend this to also get me either the TOTAL games or games NOT NULL along with it?
Table Schema:
id (INT Primary Auto-Inc)
category_id (INT)
variable_value (TEXT Allow Null Default: NULL)
When you use "Count" with a column name, null values are not included. So to get the count or percent not null just do this...
SELECT
count(1) as TotalAll,
count(variable_value) as TotalNotNull,
count(1) - count(variable_value) as TotalNull,
100.0 * count(variable_value) / count(1) as PercentNotNull
FROM
games
WHERE
category_id = '10'
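To see the difference between counting rows and counting a nullable column, here is a minimal sketch on SQLite via Python's sqlite3 module (standing in for MySQL). The sample rows are made up; the schema follows the one in the question, and the category is compared as the integer 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, category_id INTEGER, variable_value TEXT)"
)
# Hypothetical rows: four games in category 10 (one incomplete), one in category 11.
conn.executemany(
    "INSERT INTO games (category_id, variable_value) VALUES (?, ?)",
    [(10, "a"), (10, None), (10, "b"), (10, "c"), (11, None)],
)

# COUNT(*) counts rows; COUNT(variable_value) counts only non-NULL values.
query = """
SELECT
    COUNT(*)                                AS TotalAll,
    COUNT(variable_value)                   AS TotalNotNull,
    COUNT(*) - COUNT(variable_value)        AS TotalNull,
    100.0 * COUNT(variable_value) / COUNT(*) AS PercentNotNull
FROM games
WHERE category_id = 10
"""
totals = conn.execute(query).fetchone()
print(totals)
```

If the category has no rows at all, COUNT(*) is 0 and the division yields NULL in SQLite, so the calling code should be ready for a missing percentage.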
SELECT
SUM(CASE WHEN G.variable_value IS NOT NULL THEN 1 ELSE 0 END)/COUNT(*) AS pct_complete
FROM
Games G
WHERE
G.category_id = '10'
You might need to do some casting on the SUM() so that you get a decimal.
To COUNT the number of entries matching your WHERE statement, use COUNT(*)
SELECT COUNT(*) AS c FROM games WHERE games.variable_value IS NULL
If you want both total number of rows and those with variable_value being NULL in one statement, try GROUP BY
SELECT COUNT(variable_value IS NULL) AS c, (variable_value IS NULL) AS isnull FROM games GROUP BY isnull
Returns something like
c | isnull
==============
12 | 1
193 | 0
==> 12 entries have NULL in that column, 193 haven't
==> Percentage: 12 / (12 + 193)