How can I get Access SQL to return a dataset of the largest value in each category? - sql

This has been driving me crazy all day, and I've gone through every solution I can find on here. This should be a very simple thing.
I have a table in Access that contains a list of applications:
ApplicantNumber | Region
There are many more columns, but those are the two I care about at the moment. Each row is a separate application, and each applicant can submit multiple applications.
I have a query in Access that finds the count per applicant of applications in each region:
ApplicantNumber | Region | CountOfAPplications
How the ##&*!!! do I pull out of that the region with the most applications for each ApplicantNumber?
As far as I can tell, the following should work fine but it just provides the same output as the initial query with the full count per applicant:
SELECT myQry.ApplicantNumber, myQRY.Region, Max(myQRY.CountOfRegion)
FROM (SELECT AppliedCensusBlocks.ApplicantNumber, AppliedCensusBlocks.Region, Count(AppliedCensusBlocks.Region) AS CountOfRegion
FROM AppliedCensusBlocks
GROUP BY AppliedCensusBlocks.ApplicantNumber, AppliedCensusBlocks.Region) AS myQRY
GROUP BY myQry.ApplicantNumber, myQry.Region
What am I doing wrong? If I remove the Region field, Access will work as I'd expect and just show the ApplicantNumber and maximum count. BUt I'm really trying to get at the region name associated with the maximum count.

This is a bit tricky. MS Access is not the best suited for this sort of query. But here is one way
SELECT acb.ApplicantNumber, acb.Region, Count(*) AS CountOfRegion
FROM AppliedCensusBlocks as acb
GROUP BY acb.ApplicantNumber, acb.Region
HAVING COUNT(*) = (SELECT TOP 1 COUNT(*)
FROM AppliedCensusBlocks as acb2
WHERE acb2.ApplicantNumber = acb.ApplicantNumber
GROUP BY acb2.Region
ORDER BY COUNT(*) DESC, acb2.Region
);

SELECT TOP 1 ApplicantNumber, Region, COUNT(*) AS Applications
FROM AppliedCensusBlocks
GROUP BY ApplicantNumber, Region
ORDER BY COUNT(*) DESC

Related

PostgreSQL: reuse Column Data In Different Column Of The Same Query

I'm trying to create a SELECT query that does several calculated fields on one of two tables. I'm new to SQL (I've looked at several free online tutorials, so I have a general idea), but I think my goal is a little out of my skill range.
I have two tables:
TreeRecord with columns ID (serial), Site (chr)
Each ID represents an individual tree.
TreeHistory with columns ID (serial), TreeID (int), DBH (int)
DBH is tree diameter.
Currently I can create this:
| Site | Total tree count of site | Avg DBH of site |
I would like to have another column that can give the total count of trees over a particular size for each site. I can recreate this in a simple query, and my research on stack (SQL Select - Calculated Column if Value Exists in another Table) makes me feel that a nested SELECT is what I'm after but I can't get that to work. My current code is this:
SELECT
"TreeRecord"."Site",
count("TreeRecord".*) AS Total_Count,
round(avg("TreeHistory"."DBH"), 0) AS Average_DBH
FROM
"TreeRecord"
LEFT OUTER JOIN
"TreeHistory" ON "TreeRecord"."ID" = "TreeHistory"."TreeID"
GROUP BY
"Site"
ORDER BY
"Site" ASC;
Any help on this would be most appreciated.
Thank you
Use count with the specific size condition.
SELECT "TreeRecord"."Site",
count("TreeRecord".*) AS Total_Count,
round(avg("TreeHistory"."DBH"),0) AS Average_DBH,
count(case when "TreeHistory"."DBH" > 10 then 1 end) as count_over_specific_size
^^--change this size accordingly
FROM "TreeRecord"
LEFT OUTER JOIN "TreeHistory"
ON "TreeRecord"."ID" = "TreeHistory"."TreeID"
GROUP BY "Site"
ORDER BY "Site" ASC;

SQL Nested Query with distinct count

I have a dilemma, and I'm hoping someone will be able to help me out. I am attempting to work on some made up problems from an old text book of mine, this isn't a question from the book, but the data is, I just wanted to see if I could still work in SQL, so here goes. When this code is executed,
SELECT COUNT(code_description) "Number of Different Crimes", last, first,
code_description
FROM
(
SELECT criminal_id, last, first, crime_code, code_description
FROM criminals
JOIN crimes USING (criminal_id)
JOIN crime_charges USING (crime_id)
JOIN crime_codes USING (crime_code)
ORDER BY criminal_id
)
WHERE criminal_id = 1020
GROUP BY last, first, code_description;
I am provided with these results:
Number of Different Crimes LAST FIRST CODE_DESCRIPTION
1 Phelps Sam Agg Assault
1 Phelps Sam Drug Offense
Inevitably, I would like the number of different crimes to be 2 for each line since this criminal has two unique crimes charged to him. I would like it to be displayed something like:
Number of Different Crimes LAST FIRST CODE_DESCRIPTION
2 Phelps Sam Agg Assault
2 Phelps Sam Drug Offense
Not to push my luck but I would also like to get rid of the follow line also:
WHERE criminal_id = 1020
to something a little more elegant to represent any criminal with more than 1 crime type associated with them, for this case, Sam Phelps is the only one in this data set.
As #sgeddes said in a comment, you can use an analytic count, which doesn't need a subquery if you're specifying the criminal ID:
SELECT COUNT(code_description) OVER (PARTITION BY first, last) AS "Number of Different Crimes",
last, first, code_description
FROM criminals
JOIN crimes USING (criminal_id)
JOIN crime_charges USING (crime_id)
JOIN crime_codes USING (crime_code)
WHERE criminal_id = 1020;
If you want to look for anyone with multiple crimes then you do need a subquery so you can filter on the analytic result:
SELECT charge_count AS "Number of Different Crimes",
last, first, code_description
FROM (
SELECT COUNT(DISTINCT code_description) OVER (PARTITION BY first, last) AS charge_count,
criminal_id, last, first, code_description
FROM criminals
JOIN crimes USING (criminal_id)
JOIN crime_charges USING (crime_id)
JOIN crime_codes USING (crime_code)
)
WHERE charge_count > 1
ORDER BY criminal_id, code_description;
SQL Fiddle demo.
If the charges are across multiple crimes, but duplicated, then the distinct count still works, but you might want to make add a distinct to the overall result set - unless you want to show other crime-specific info - otherwise you get something like this.

Oracle Data Gaps

im looking for a query to fill this condition:
That currently gives us the number of BACs at the entity (which is something we need). The database assigns the BAC IDs consecutively within each accounting entity. So we need to add one more field to the query showing the current highest BAC ID at the entity. And once we have that, just filter the results down to anyplace the number of records doesn't equal the highest ID.
My current query:
select accounting_entity_id, count(bac_id)
from dc.pl_bac_information
group by accounting_entity_id
having count(bac_id) > 1;
Use analytic functions for this:
select bi.*
from (select bi.*, max(bac_id) over (partition by accounting_entity_id) as max_bac_id
from dc.pl_bac_information bi
) bi
where bac_id = max_bac_id;
This assumes you are using Oracle.
SELECT ACCOUNTING_ENTITY_ID
FROM DC.PL_BAC_INFORMATION
HAVING COUNT(BAC_ID) > 1 AND COUNT(BAC_ID) != MAX(BAC_ID)
GROUP BY ACCOUNTING_ENTITY_ID;

Getting a unique value from an aggregated result set

I've got an aggregated query that checks if I have more than one record matching certain conditions.
SELECT RegardingObjectId, COUNT(*) FROM [CRM_MSCRM].[dbo].[AsyncOperationBase] a
where WorkflowActivationId IN ('55D9A3CF-4BB7-E311-B56B-0050569512FE',
'1BF5B3B9-0CAE-E211-AEB5-0050569512FE',
'EB231B79-84A4-E211-97E9-0050569512FE',
'F0DDF5AE-83A3-E211-97E9-0050569512FE',
'9C34F416-F99A-464E-8309-D3B56686FE58')
and StatusCode = 10
group by RegardingObjectId
having COUNT(*) > 1
That's nice, but then there is one field in AsyncOperationBase that will be unique. Say count(*) = 3, well, AsyncOperationBaseId in AsyncOperationBase will have 3 different values since AsyncOperationBase is the table's primary key.
To be honest, I would not even know what terms and expressions to Google to find a solution.
If anyone has a solution and also, is there any words to describe what I'm looking for ? Perhaps BI people are often faced with such a requirement or something...
I could do it with an SSRS report where the report would visually do the grouping then I could expand each grouped row to get the AsyncOperationBaseId value, but simply through SQL, I can't seem to find a way out...
Thanks.
select * from [CRM_MSCRM].[dbo].[AsyncOperationBase]
where RegardingObjectId in
(
SELECT RegardingObjectId
FROM [CRM_MSCRM].[dbo].[AsyncOperationBase] a
where WorkflowActivationId IN
(
'55D9A3CF-4BB7-E311-B56B-0050569512FE',
'1BF5B3B9-0CAE-E211-AEB5-0050569512FE',
'EB231B79-84A4-E211-97E9-0050569512FE',
'F0DDF5AE-83A3-E211-97E9-0050569512FE',
'9C34F416-F99A-464E-8309-D3B56686FE58'
)
and StatusCode = 10
group by RegardingObjectId
having COUNT(*) > 1
)

Finding most popular and most unique records using SQL

My mom wanted a baby name game for my brother's baby shower. Wanting to learn python, I volunteered to do it. I pretty much have the python bit, it's the SQL that is throwing me.
The way the game is supposed to work is everyone at the shower writes down names on paper, I manually enter them into Excel (normalizing spellings as much as possible) and export to MS Access. Then I run my python program to find the player with the most popular names and the player with the most unique names. The database, called "babynames", is just four columns.
ID | BabyFirstName | BabyMiddleName | PlayerName
---|---------------|----------------|-----------
My mom has changed things every so often, but as they stand right now, I have to figure out :
a) The most popular name (or names if there is a tie) out of all first and middle names
b) The most unique name (or names if there is a tie) out of all the first and middle names
c) The player that has the most number of popular names (wins a prize)
d) The player that has the most number of unique names (wins a prize)
I've been working on this for about a week now and can't even get a SQL query for a) and b) to work, much less c) and d). I'm more than just a bit frustrated.
BTW, I'm just looking at spellings of the names, not phonetics. As I manually enter names, I will change names like "Kris" to "Chris" and "Xtina" to "Christina" etc.
Editing to add a couple of the most recent queries I tried for a)
SELECT [BabyFirstName],
COUNT ([BabyFirstName]) AS 'FirstNameOccurrence'
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY 'FirstNameOccurrence' DESC
LIMIT 1
and
SELECT [BabyFirstName]
FROM [babynames]
GROUP BY [BabyFirstName]
HAVING COUNT(*) =
(SELECT COUNT(*)
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT(*) DESC
LIMIT 1)
These both lead to syntax errors.
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error in ORDER BY clause. (-3508) (SQLExecDirectW)')
I've tried using [FirstNameOccurrence] and just FirstNameOccurrence as well with the same error. Not sure why it's not recognizing it by that column name to order by.
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error. in query expression 'COUNT(*) = (SELECT COUNT(*) FROM [babynames] GROUP BY [BabyFirstName] ORDER BY COUNT(*) DESC LIMIT 1)'. (-3100) (SQLExecDirectW)")
I'll admit that I'm not really grokking all of the COUNT(*) commands here, but this was a solution for a similar issue here in stackoverflow that I figured I'd try when my other idea didn't pan out.
For A and B, use a group by clause in your SQL, and then count, and order by the count. Use descending order for A and ascending order for B, and just take the first result for each.
For C and D, essentially use the same strategy but now just add the PlayerName (e.g. group by babyname,playername) and then use the ascending order/descending order question.
Here's Microsoft's write-up for a group by clause in MS Access: https://office.microsoft.com/en-us/access-help/group-by-clause-HA001231482.aspx
Here's an even better write-up demonstrating how to do both group by and order by at the same time: http://rogersaccessblog.blogspot.com/2009/06/select-queries-part-3-sorting-and.html
For the first query you tried, change it to:
SELECT TOP 1 [BabyFirstName],
COUNT ([BabyFirstName]) AS 'FirstNameOccurrence'
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY 'FirstNameOccurrence' DESC
For the second, change it to:
SELECT [BabyFirstName]
FROM [babynames]
GROUP BY [BabyFirstName]
HAVING COUNT(*) =
(SELECT TOP 1 COUNT(*)
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT(*) DESC)
Limiting the number of records returned by a SQL Statement in Access is achieved by adding a TOP statement directly after SELECT, not with ORDER BY... LIMIT
Also, Access TOP statement will return all instances of the top n (or n percent) unique records, so if there are two or more identical records in the query output (before TOP), and TOP 1 is specified, you'll see them all.