Count distinct query MS Access - sql

It seems that Count(DISTINCT column) cannot be used in MS Access. I have the following data and the expected result shown below.
I am looking for an MS Access query that gives the required result.
Data
ID   Name   Category  Person  Office
1    FIL    Global    Ben     london
1    FIL    Global    Ben     london
1    FIL    Overall   Ben     Americas
106  Asset  Global    Ben     london
156  ICICI  Overall   Rimmer  london
156  ICICI  Overall   Rimmer  london
188  UBS    Overall   Rimmer  london
9    Fund   Global    Rimmer  london
Expected Result
Person  Global_Cnt  Overall_Cnt
Ben     2           1
Rimmer  1           2

Use a subquery to select the distinct values from your table.
In the parent query, GROUP BY Person, and use separate Count() expressions for each category. Count() only counts non-Null values, so use IIf() to return 1 for the category of interest and Null otherwise.
SELECT
    sub.Person,
    Count(IIf(Category = 'Global', 1, Null)) AS Global_Cnt,
    Count(IIf(Category = 'Overall', 1, Null)) AS Overall_Cnt
FROM
    (
        SELECT DISTINCT ID, Category, Person
        FROM YourTable
    ) AS sub
GROUP BY sub.Person;
I was unsure which fields identify your unique values, so I chose ID, Category, and Person. The result set from the query matches what you asked for; change the SELECT DISTINCT field list if it doesn't fit your actual data.
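A minimal sketch of the same subquery-then-count pattern, run against the question's sample data. It uses Python's sqlite3 module as a stand-in for Access (an assumption made only so the example is runnable), and CASE WHEN replaces IIf() because the two engines have different function sets. The table name YourTable comes from the answer above.

```python
import sqlite3

# SQLite stands in for Access here; this is an assumption for the demo only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable "
    "(ID INTEGER, Name TEXT, Category TEXT, Person TEXT, Office TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "FIL", "Global", "Ben", "london"),
        (1, "FIL", "Global", "Ben", "london"),
        (1, "FIL", "Overall", "Ben", "Americas"),
        (106, "Asset", "Global", "Ben", "london"),
        (156, "ICICI", "Overall", "Rimmer", "london"),
        (156, "ICICI", "Overall", "Rimmer", "london"),
        (188, "UBS", "Overall", "Rimmer", "london"),
        (9, "Fund", "Global", "Rimmer", "london"),
    ],
)

# The inner SELECT DISTINCT removes duplicates; the outer query counts
# per category. CASE without ELSE yields NULL, which COUNT() skips.
query = """
SELECT sub.Person,
       COUNT(CASE WHEN sub.Category = 'Global'  THEN 1 END) AS Global_Cnt,
       COUNT(CASE WHEN sub.Category = 'Overall' THEN 1 END) AS Overall_Cnt
FROM (SELECT DISTINCT ID, Category, Person FROM YourTable) AS sub
GROUP BY sub.Person
ORDER BY sub.Person
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Ben', 2, 1), ('Rimmer', 1, 2)]
```

The output matches the expected result in the question. In Access itself the IIf() form shown above is the one to use.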

When creating a query in Microsoft Access, you might want to return only distinct or unique values. The query's property sheet has two options for this: "Unique Values" (DISTINCT) and "Unique Records" (DISTINCTROW).
DISTINCT and DISTINCTROW sometimes provide the same results, but there are significant differences:
DISTINCT
DISTINCT checks only the fields listed in the SQL string and then eliminates the duplicate rows. Results of DISTINCT queries are not updateable. They are a snapshot of the data.
DISTINCT queries are similar to Summary or Totals queries (queries using a GROUP BY clause).
DISTINCTROW
DISTINCTROW, on the other hand, checks all fields in the table that is being queried, and eliminates duplicates based on the entire record (not just the selected fields). Results of DISTINCTROW queries are updateable.

The MS Access engine does not support
SELECT count(DISTINCT....) FROM ...
You have to do it like this:
SELECT count(*)
FROM
(SELECT DISTINCT Name FROM table1)
It's a small workaround: you count the rows of a DISTINCT selection.

select count(column) as guessTable
from
(
select distinct column from Table
)

Related

How to aggregate data stored column-wise in a matrix table

I have a table; the ellipses (...) represent multiple columns of a similar type.
TABLE: diagnosis_info
COLUMNS: visit_id,
patient_diagnosis_code_1 ...
patient_diagnosis_code_100 -- char(100) with a value of ‘0’ or ‘1’
How do I find the most common diagnosis_code? There are 101 columns including the visit_id. The table is like a matrix table of 0s and 1s. How do I write something that can dynamically account for all the columns and count all the rows where the value is 1?
What I would normally do is not feasible as there are too many columns:
SELECT COUNT(patient_diagnosis_code_1), COUNT(patient_diagnosis_code_2), ... FROM diagnosis_info WHERE patient_diagnosis_code_1 = '1' AND patient_diagnosis_code_2 = '1' AND ...
Then, even if I typed all that out, how would I select which column had the highest count of values = 1? The table is column oriented rather than row oriented.
Unfortunately your data design is bad from the start. Instead it could be as simple as:
patient_id, visit_id, diagnosis_code
where a patient with 1 diagnosis code would have 1 row and a patient with 100 diagnosis codes would have 100 rows. At any given time you could transpose this into the format you presented (this is called a pivot or cross tab). Also, in some databases, for example PostgreSQL, you could put all those diagnosis codes into an array field, and then it would look like:
patient_id, visit_id, diagnosis_code (data type -bool or int- array)
What you need now is the reverse operation, which is called unpivot. Some databases, such as SQL Server, provide an UNPIVOT operator for this.
Without knowing what your backend is, you could do that with an ugly SQL statement like:
select code, pdc
from
(
    select 1 as code, count(*) as pdc
    from myTable where patient_diagnosis_code_1=1
    union
    select 2 as code, count(*) as pdc
    from myTable where patient_diagnosis_code_2=1
    union
    ...
    select 100 as code, count(*) as pdc
    from myTable where patient_diagnosis_code_100=1
) tmp
order by pdc desc, code;
PS: This returns all the codes with their frequencies, ordered from most to least frequent. You could limit the result to 1 row to get the maximum (or allow ties, in case more than one code matches the maximum).
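Typing out 100 UNION branches by hand is tedious, so one option is to generate them. The sketch below builds the same kind of statement in Python and runs it with sqlite3. The backend, the three-column table, and the sample rows are assumptions for illustration; the column naming follows the question.

```python
import sqlite3

N_CODES = 3  # the question has 100 columns; 3 keeps the demo small
cols = [f"patient_diagnosis_code_{i}" for i in range(1, N_CODES + 1)]

conn = sqlite3.connect(":memory:")
col_defs = ", ".join(f"{c} TEXT" for c in cols)
conn.execute(f"CREATE TABLE diagnosis_info (visit_id INTEGER, {col_defs})")

# Hypothetical sample rows: one row per visit, '0'/'1' flags per code.
placeholders = ", ".join("?" * (N_CODES + 1))
conn.executemany(
    f"INSERT INTO diagnosis_info VALUES ({placeholders})",
    [(1, "1", "0", "1"), (2, "0", "0", "1"), (3, "1", "0", "1")],
)

# One SELECT per column, glued together with UNION ALL.
branches = [
    f"SELECT {i} AS code, COUNT(*) AS pdc "
    f"FROM diagnosis_info WHERE {col} = '1'"
    for i, col in enumerate(cols, start=1)
]
query = " UNION ALL ".join(branches) + " ORDER BY pdc DESC, code"

counts = conn.execute(query).fetchall()
print(counts)  # [(3, 3), (1, 2), (2, 0)]
```

The first row is the most common diagnosis code. The column names are generated from a fixed pattern rather than user input, which is why string formatting is acceptable here.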

SQL for Middle Value Rather than MIN/MAX or FIRST/LAST

Is there a SQL function to return the middle value of three?
For example, assume I have a table with people who have three cars, sorted alphabetically by AutoMaker.
John: Ford
John: Honda
John: VW
then
MIN(AutoMaker) returns Ford.
MAX(AutoMaker) returns VW.
Is there a similar SQL function that will return Honda?
I am working with MS Access and Oracle.
Thank you.
Short answer: No. It's too specific.
Longer answer: with three records, the "middle" you describe is simply the second record. With 5 records it would be the third, and so on. If you need this in practice, assign a row number to each row (Oracle, Access) and then select the ((n+1)/2)th row (WHERE row_number = (n+1)/2).
PS - which is the middle row if you have 4 rows? :)
The query could be something like this:
select row_id, Field1
FROM tbl
where row_id = (select Int((count(Field1) + 1) / 2) from tbl)
The problem in Access is that there is no row_number, so you would need to add a row_id column to the table and populate it with 1, 2, 3, 4, ... (ordered on Field1).
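Where no row number is available, the rank can be computed with a correlated subquery instead. The sketch below does this for the John example, using Python's sqlite3 as a stand-in engine. The table name cars and the column Owner are hypothetical, and the query assumes each owner's AutoMaker values are distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the question's example rows.
conn.execute("CREATE TABLE cars (Owner TEXT, AutoMaker TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("John", "Ford"), ("John", "Honda"), ("John", "VW")],
)

# Rank of a row = number of rows for the same owner that sort at or
# before it. The middle row is the one whose rank equals (n + 1) / 2.
# SQLite divides integers as integers, so (3 + 1) / 2 is 2.
query = """
SELECT c.Owner, c.AutoMaker
FROM cars AS c
WHERE (SELECT COUNT(*) FROM cars AS r
       WHERE r.Owner = c.Owner AND r.AutoMaker <= c.AutoMaker)
    = (SELECT (COUNT(*) + 1) / 2 FROM cars AS t
       WHERE t.Owner = c.Owner)
"""
middle = conn.execute(query).fetchall()
print(middle)  # [('John', 'Honda')]
```

With an even number of rows this picks the lower of the two middle rows. Integer division behaves differently across engines, so the (n + 1) / 2 expression would need adapting for Access or Oracle.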

SQL - Insert using Column based on SELECT result

I currently have a table called tempHouses that looks like:
avgprice | dates | city
dates are stored as yyyy-mm-dd
However I need to move the records from that table into a table called houses that looks like:
city | year2002 | year2003 | year2004 | year2005 | year2006
The information in tempHouses contains average house prices from 1995 - 2014.
I know I can use SUBSTRING to get the year from the dates:
SUBSTRING(dates, 1, 4)
So basically for each city in tempHouses.city I need to get the average house price from the above years into one record.
Any ideas on how I would go about doing this?
This is a SQL Server approach, and a PIVOT may be better, but here's one way:
SELECT City,
       AVG(year2002) AS year2002,
       AVG(year2003) AS year2003,
       AVG(year2004) AS year2004
FROM (
    SELECT City,
           -- NULL (not 0) for other years, so AVG ignores those rows
           CASE WHEN Dates BETWEEN '2002-01-01T00:00:00' AND '2002-12-31T23:59:59' THEN avgprice
                ELSE NULL
           END AS year2002,
           CASE WHEN Dates BETWEEN '2003-01-01T00:00:00' AND '2003-12-31T23:59:59' THEN avgprice
                ELSE NULL
           END AS year2003,
           CASE WHEN Dates BETWEEN '2004-01-01T00:00:00' AND '2004-12-31T23:59:59' THEN avgprice
                ELSE NULL
           END AS year2004
           -- Repeat for each year
    FROM tempHouses
) AS t
GROUP BY City
The inner query gets the data into the correct format for each record (City, year2002, year2003, year2004), whilst the outer query gets the average for each City.
There may be many ways to do this, and performance may be the deciding factor in which one to choose.
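The same conditional-aggregation idea can be collapsed into a single grouped query, with the year columns generated rather than typed. This is a sketch using Python's sqlite3 rather than SQL Server; the city names and prices are made up for illustration, and only two years are shown. A CASE with no ELSE branch yields NULL for rows from other years, which AVG skips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tempHouses (avgprice REAL, dates TEXT, city TEXT)")
# Hypothetical sample rows; dates are stored as yyyy-mm-dd strings.
conn.executemany(
    "INSERT INTO tempHouses VALUES (?, ?, ?)",
    [
        (100.0, "2002-01-15", "Leeds"),
        (200.0, "2002-07-15", "Leeds"),
        (300.0, "2003-03-01", "Leeds"),
        (400.0, "2002-05-20", "York"),
    ],
)

years = [2002, 2003]  # extend to cover every year column that is needed
# One AVG(CASE ...) column per year; no ELSE means NULL for other years.
year_cols = ",\n       ".join(
    f"AVG(CASE WHEN substr(dates, 1, 4) = '{y}' THEN avgprice END) AS year{y}"
    for y in years
)
query = f"""
SELECT city,
       {year_cols}
FROM tempHouses
GROUP BY city
ORDER BY city
"""
pivot = conn.execute(query).fetchall()
print(pivot)  # [('Leeds', 150.0, 300.0), ('York', 400.0, None)]
```

A city with no data for a year gets NULL (None in Python) in that column instead of a misleading 0.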
The best way would be to use a script to run the queries for you, because you will need to run them multiple times and extract the data by year. Make sure that the only required columns are city and row id:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
INSERT INTO <table> (city) SELECT DISTINCT `city` FROM <old_table>;
Then for each city extract the average values, insert them into a temporary table and then insert into the main table.
SELECT AVG(avgprice), SUBSTRING(dates, 1, 4) AS yr FROM <old_table> GROUP BY yr;
Otherwise you're looking at a combination query using joins and potentially unions to extrapolate the data. Because you're flattening the table into a single row per city it's going to be a little tough to do. You should create indexes first on the date column if you don't want the database query to fail with memory limits or just take a very long time to execute.

Multiple Distinct Values with SUM

I have a table with data like this:
userID  name  amount  Date
1       mark  20      22-10
1       mark  30      22-10
2       kane  50      22-12
2       kane  60      22-12
3       mike  60      22-10
The Date is unique only in combination with userID and name; since there are more than 100k records, the same date may appear for other user IDs and names.
Now I want output like this:
userID  name  amount  Date
1       mark  50      22-10
2       kane  110     22-12
3       mike  60      22-10
If I try GROUP BY with id, name, sum(amount), date, it returns multiple rows and the answer is incorrect.
I have tried various combinations of DISTINCT and SUM, but without success.
Is there a solution?
Thanks
You typically use a GROUP BY to aggregate one or more fields by one or more other fields.
SELECT userID, Name, SUM(amount), MIN(date)
FROM YourTable
GROUP BY
userID, Name
From MSDN: Aggregate functions
Aggregate functions perform a calculation on a set of values and
return a single value. With the exception of COUNT, aggregate
functions ignore null values. Aggregate functions are often used with
the GROUP BY clause of the SELECT statement.
some typical aggregate functions are
SUM
MIN
AVG
...
From MSDN: GROUP BY
Groups a selected set of rows into a set of summary rows by the values
of one or more columns or expressions in SQL Server 2008 R2. One row
is returned for each group. Aggregate functions in the SELECT clause
list provide information about each group instead of
individual rows.
select
userId, name, sum(amount) as amount
from
table
group by
userId, name
You don't want DISTINCT here, but rather GROUP BY with the aggregate SUM()
SELECT
userID,
name,
SUM(amount) AS amount
FROM tbl
GROUP BY userID, name
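As a quick check, the GROUP BY approach can be run against the question's sample rows. The sketch below uses Python's sqlite3 as the engine (an assumption; the question does not name a database) and the placeholder table name tbl from the answer above. MIN(Date) is included so the Date column appears in the output without being part of the grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (userID INTEGER, name TEXT, amount INTEGER, Date TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (1, "mark", 20, "22-10"),
        (1, "mark", 30, "22-10"),
        (2, "kane", 50, "22-12"),
        (2, "kane", 60, "22-12"),
        (3, "mike", 60, "22-10"),
    ],
)

# Group only by the columns that identify a user; aggregate the rest.
query = """
SELECT userID, name, SUM(amount) AS amount, MIN(Date) AS Date
FROM tbl
GROUP BY userID, name
ORDER BY userID
"""
totals = conn.execute(query).fetchall()
print(totals)
# [(1, 'mark', 50, '22-10'), (2, 'kane', 110, '22-12'), (3, 'mike', 60, '22-10')]
```

Putting amount itself in the GROUP BY list is what produces the extra rows described in the question, since each distinct amount then forms its own group.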

I DISTINCTly hate MySQL (help building a query)

This is straightforward, I believe:
I have a table with 30,000 rows. When I SELECT DISTINCT `location` FROM myTable it returns 21,000 rows, about what I'd expect, but it only returns that one column.
What I want is to move those to a new table, but the whole row for each match.
My best guess is something like SELECT * from (SELECT DISTINCT `location` FROM myTable) or something like that, but it says I have a vague syntax error.
Is there a good way to grab the rest of each DISTINCT row and move it to a new table all in one go?
SELECT * FROM myTable GROUP BY `location`
or if you want to move to another table
CREATE TABLE foo AS SELECT * FROM myTable GROUP BY `location`
DISTINCT applies to the entire row returned. So you can simply use
SELECT DISTINCT * FROM myTable GROUP BY `location`
Using Distinct on a single column doesn't make a lot of sense. Let's say I have the following simple set
-id- -location-
1 store
2 store
3 home
If there were some sort of query that returned all columns, but was distinct only on location, which row would be returned? 1 or 2? Should it just pick one at random? Because of this, DISTINCT works on all columns in the result set returned.
Well, first you need to decide what you really want returned.
The problem is that, presumably, for some of the location values in your table there are different values in the other columns even when the location value is the same:
Location OtherCol StillOtherCol
Place1 1 Fred
Place1 89 Fred
Place1 1 Joe
In that case, which of the three rows do you want to select? When you talk about a DISTINCT Location, you're condensing those three rows of different data into a single row, so there's no meaning to moving the original rows from the original table into a new table, since those original rows no longer exist in your DISTINCT result set. (If all the other columns are always the same for a given Location, your problem is easier: just SELECT DISTINCT * FROM YourTable.)
If you don't care which values come from the other columns you can use a (bad, IMHO) MySQL extension to SQL and do:
SELECT * FROM YourTable GROUP BY Location
which will give a result set with one row per location and values for the other columns derived from the original data in an undefined fashion. (With ONLY_FULL_GROUP_BY enabled, which is the default since MySQL 5.7.5, this query is rejected.)
Multiple rows with identical values in all columns make no sense. OK, the question might be about a way to correct exactly that situation.
Considering this table, with id being the PK:
kram=# select * from foba;
 id | no |     name
----+----+---------------
  2 |  1 | a
  3 |  1 | b
  4 |  2 | c
  5 |  2 | a,b,c,d,e,f,g
you may extract a sample for every single no (:=location) by grouping over that column, and selecting the row with minimum PK (for example):
SELECT * FROM foba WHERE id IN (SELECT min (id) FROM foba GROUP BY no);
 id | no | name
----+----+------
  2 |  1 | a
  4 |  2 | c
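To go one step further and actually move the chosen rows into a new table, the MIN(id) selection can feed a CREATE TABLE ... AS SELECT. The sketch below runs this with Python's sqlite3 rather than MySQL. The id primary key, the note column, the table name newTable, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: the original question only names `location`.
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, location TEXT, note TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, "store", "a"), (2, "store", "b"), (3, "home", "c")],
)

# Keep the whole row with the smallest id for each location.
conn.execute("""
CREATE TABLE newTable AS
SELECT * FROM myTable
WHERE id IN (SELECT MIN(id) FROM myTable GROUP BY location)
""")

kept = conn.execute(
    "SELECT id, location, note FROM newTable ORDER BY id"
).fetchall()
print(kept)  # [(1, 'store', 'a'), (3, 'home', 'c')]
```

Unlike SELECT * ... GROUP BY location, this makes the choice of surviving row explicit: the one with the lowest primary key.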