Grouping and null values in column - sql

I need some help fixing a problem.
Below is my input data. I am grouping by the name field. The query I am currently using for the grouping is given below.
select name from Table
group by name having count(distinct DOB)='1'
But the problem is that the above query won't fetch records if the DOB field is null for all records within a group. If I substitute a dummy value for the DOB field, it won't fetch the first two rows, and if I don't substitute a dummy value, it won't fetch the records in rows 3 and 4.
I tried something like this, but it is wrong
select name from Table
group by name having count(distinct case when DOB is null then '9999-01-01' else DOB END)='1'
Could someone help with some suggestions? My expected result is given below.

You can replace the logic with:
having min(dob) = max(dob) or
min(dob) is null
Depending on your data, count(distinct) can be relatively expensive, so this can actually be cheaper than using it.
You can use count(distinct). Just change the comparison value:
having count(distinct dob) <= 1
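Both HAVING variants above can be tried on a small data set. The following is only a sketch using Python's built-in sqlite3 module; the table name people and the sample rows are assumptions made for illustration, not the asker's data.

```python
import sqlite3

# Hypothetical sample data: the table name "people" and its rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, dob TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("A", "1990-01-01"), ("A", "1990-01-01"),  # one distinct DOB
        ("B", None), ("B", None),                  # DOB is null for the whole group
        ("C", "1985-05-05"), ("C", "1986-06-06"),  # two distinct DOBs
        ("D", "1970-07-07"), ("D", None),          # one DOB plus a null
    ],
)

# Variant 1: MIN/MAX comparison, with an explicit branch for all-null groups.
MIN_MAX = """
    SELECT name FROM people
    GROUP BY name
    HAVING MIN(dob) = MAX(dob) OR MIN(dob) IS NULL
    ORDER BY name
"""

# Variant 2: COUNT(DISTINCT) ignores nulls, so an all-null group counts 0.
COUNT_DISTINCT = """
    SELECT name FROM people
    GROUP BY name
    HAVING COUNT(DISTINCT dob) <= 1
    ORDER BY name
"""

min_max_names = [row[0] for row in conn.execute(MIN_MAX)]
count_distinct_names = [row[0] for row in conn.execute(COUNT_DISTINCT)]
print(min_max_names)         # ['A', 'B', 'D']
print(count_distinct_names)  # ['A', 'B', 'D']
```

Note that group D (one real DOB plus a null) passes both filters, because aggregate functions skip nulls. If such mixed groups should be excluded, an extra condition would be needed.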

SQL query to return results from one to many table

I'm having difficulties trying to return some data from a poorly structured one to many table.
I've been provided with a data export where everything from 'Section Codes' onwards (in cat_fullXPath) relates to a 'skillID' in my client's database.
The results were previously returned on one line, but I've used a split function to break these out (from the cat_fullXPath column). You can see the relevant 'skillID' from my client's DB in the far right column:
From here, there are thousands of records that may have a mixture of these skillIDs (and many others, I've just provided this one example). I want to be able to find the records that match all 4 (or however many match from another example) skillIDs and ONLY those.
For example (I just happen to know this ID gives me the results I want):
SELECT
id,
skillID
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
AND id = 69580;
This returns me:
Note that these are the only columns in that table.
So this is an ID I'd want to return.
These are results I would not want to return, as one of the skillIDs is missing:
I've created a temp table with a count of all the skills for each ID but I'm not sure if I'm going down the right path at this point
I'm pretty sure that there's a simple solution to this, however I'm hitting my head against the wall. Hope someone can help!
EDIT
This might be a clearer example of when there are different groups of skillIds that I need to align. I've partitioned these off by cat_fullxpath to see if this makes things clearer:
In this screenshot, for example I want to find the ids for everything in table1 where skillID IN (1003914,1005354,1004701) then repeat for (1004659,1004492,1004493,1004701). etc
We know that you need exactly 4 skills, so just make a subquery:
select id from
(
SELECT
id,
count(skillID) countSkill
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
group by id
) t
where countSkill = 4;
This could also work with sum instead of count: instead of filtering on 4, you filter on 4022352, which is the sum of the four skillIDs.
You can also remove the subquery and use HAVING, although performance may be worse.
SELECT
id,
count(skillID) countSkill
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
group by id
having count(skillID) = 4;
You haven't told us your DBMS. Here is a standard SQL approach:
select id
from table1
group by id
having count(case when skillid = 1004464 then 1 end) > 0
and count(case when skillid = 1006543 then 1 end) > 0
and count(case when skillid = 1004605 then 1 end) > 0
and count(case when skillid = 1006740 then 1 end) > 0
and count(case when skillid not in (1004464, 1006543, 1004605, 1006740) then 1 end) = 0;
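To see why the last condition matters for the "ONLY those" requirement, here is a rough sketch in Python with the built-in sqlite3 module. The ids 70001 and 70002 and the way the sample rows are combined are made up for illustration; only id 69580 and the skill IDs come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, skillID INTEGER)")

wanted = (1004464, 1006543, 1004605, 1006740)
rows = [(69580, s) for s in wanted]                  # exactly the four skills
rows += [(70001, s) for s in wanted[:3]]             # hypothetical id: one skill missing
rows += [(70002, s) for s in wanted + (1003914,)]    # hypothetical id: one extra skill
conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)

# Counting matches only: an id with additional skills still passes.
COUNT_ONLY = """
    SELECT id FROM table1
    WHERE skillID IN (1004464, 1006543, 1004605, 1006740)
    GROUP BY id
    HAVING COUNT(skillID) = 4
    ORDER BY id
"""

# Conditional aggregation: every wanted skill present and nothing else.
EXACT_SET = """
    SELECT id FROM table1
    GROUP BY id
    HAVING COUNT(CASE WHEN skillID = 1004464 THEN 1 END) > 0
       AND COUNT(CASE WHEN skillID = 1006543 THEN 1 END) > 0
       AND COUNT(CASE WHEN skillID = 1004605 THEN 1 END) > 0
       AND COUNT(CASE WHEN skillID = 1006740 THEN 1 END) > 0
       AND COUNT(CASE WHEN skillID NOT IN (1004464, 1006543, 1004605, 1006740)
                      THEN 1 END) = 0
    ORDER BY id
"""

count_only_ids = [row[0] for row in conn.execute(COUNT_ONLY)]
exact_set_ids = [row[0] for row in conn.execute(EXACT_SET)]
print(count_only_ids)  # [69580, 70002]
print(exact_set_ids)   # [69580]
```

The count-based filter accepts the id that carries an extra skill, while the conditional aggregation returns only the id whose skills are exactly the four listed.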
Another option is to concatenate all skills and see if the resulting skill list matches the desired skill list. In SQL Server the string aggregation function is STRING_AGG.
select id
from table1
group by id
having string_agg(skillid, ',') within group (order by skillid) in
(
'1004464,1004605,1006543,1006740'
);
You can easily extend the IN clause with other combinations or even get the list from another table. Only make sure the skill IDs in the strings are sorted in order to make the strings comparable ('1004464,1004605,1006543,1006740' <> '1006740,1004464,1004605,1006543').

Multiple query data in ms access

I have a table in an accdb that consists of several columns. They include a social security number, several dates, and monetary values. I am trying to query this data (there are over 600000 records in the accdb).
A social security number can appear once or several times in the database. The dates and the monetary values on the same row (in different columns) may or may not differ.
So let's say my table looks like this:
Ssn Date1 Date2 moneyvalue PostDate
123455 12-01-20 03-04-20 5.21 (a datetime value)
I am trying to do several things:
First, I want to select only the ssn values that appear at least twice in the database.
From those results I want only the ones where date1 is equal to date2.
From those results I want the ones where there are different values in moneyvalue per ssn.
I want to compare the moneyvalue of each row for an ssn to the moneyvalue from the first time this ssn appears in the database (the one with the oldest datetime in postDate) and return this ssn if the moneyvalue is different.
Is this possible? How would I go about this? I have to do this from within the MS Access SQL window; I can't export the database to MSSQL as it is protected.
So to sum it up:
I want to retrieve all ssn that appear twice or more in the database, where date1 is equal to date2, and where the monetary value in record x does not match the monetary value in the ssn with the oldest postDate.
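Your question suggests aggregation with multiple conditions in the having clause. Access SQL does not support count(distinct ...), so the check for differing values compares min and max instead:
select ssn
from mytable
group by ssn
having
count(*) > 1
and sum(iif(date1 = date2, 1, 0)) > 1
and min(moneyvalue) <> max(moneyvalue)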
Another interpretation is a where clause on the condition date1 = date2 (again comparing min and max, because Access SQL has no count(distinct ...)):
select ssn
from mytable
where date1 = date2
group by ssn
having
count(*) > 1
and min(moneyvalue) <> max(moneyvalue)
Note that the two queries are not equivalent; my understanding is that the first one is what you asked for.
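The difference between the two interpretations can be shown on a few rows. This is a sketch run against SQLite through Python's sqlite3 module rather than Access, so iif is written as a CASE expression and the "more than one distinct moneyvalue" test is written as MIN <> MAX, which is equivalent for non-null values. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (ssn TEXT, date1 TEXT, date2 TEXT, moneyvalue REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        # ssn 222: the only differing amount sits on a row where date1 <> date2
        ("222", "2020-01-12", "2020-01-12", 5.0),
        ("222", "2020-02-01", "2020-02-01", 5.0),
        ("222", "2020-03-01", "2020-04-03", 9.0),
        # ssn 333: two rows with equal dates and different amounts
        ("333", "2020-01-12", "2020-01-12", 1.0),
        ("333", "2020-02-01", "2020-02-01", 2.0),
    ],
)

# Interpretation 1: all rows of an ssn take part in the money comparison.
HAVING_ONLY = """
    SELECT ssn FROM mytable
    GROUP BY ssn
    HAVING COUNT(*) > 1
       AND SUM(CASE WHEN date1 = date2 THEN 1 ELSE 0 END) > 1
       AND MIN(moneyvalue) <> MAX(moneyvalue)
    ORDER BY ssn
"""

# Interpretation 2: rows with date1 <> date2 are dropped before grouping.
WHERE_FIRST = """
    SELECT ssn FROM mytable
    WHERE date1 = date2
    GROUP BY ssn
    HAVING COUNT(*) > 1
       AND MIN(moneyvalue) <> MAX(moneyvalue)
    ORDER BY ssn
"""

having_only = [row[0] for row in conn.execute(HAVING_ONLY)]
where_first = [row[0] for row in conn.execute(WHERE_FIRST)]
print(having_only)  # ['222', '333']
print(where_first)  # ['333']
```

Neither query compares against the row with the oldest postDate specifically; both only detect that some amounts differ within the ssn.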

Determine the number of times a null value occurs in column B for a distinct value in column A, SQL table

I have a SQL table with "name" as one column, date as another, and location as a third. The location column supports null values.
I am trying to write a query to determine the number of times a null value occurs in the location column for each distinct value in the name column.
Can someone please assist?
One method uses conditional aggregation:
select name, sum(case when location is null then 1 else 0 end)
from t
group by name;
Another method that involves slightly less typing is:
select name, count(*) - count(location)
from t
group by name;
Use count together with a filter, since you only need the null occurrences:
select name, count(*) occurrences
from mytable
where location is null
group by name
From your question, you'll want to get a distinct list of all different 'name' rows, and then you would like a count of how many NULLs there are per each name.
The following will achieve this:
SELECT name, count(*) as null_counts
FROM table
WHERE location IS NULL
GROUP BY name
The WHERE clause retrieves only the records that have NULL as their location.
The GROUP BY groups those records by name.
The SELECT will give you the name, and the COUNT(*) of the number of records, per name.
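One difference between the approaches in this thread is worth checking: filtering with WHERE drops names that have no null locations at all, whereas the aggregate-only versions report them with a count of 0. A small sketch with Python's sqlite3 module, using made-up rows in a table named t:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("x", None), ("x", None), ("x", "a"), ("y", "b")],  # hypothetical rows
)

# Conditional aggregation: every name appears, including those with 0 nulls.
conditional = conn.execute(
    """
    SELECT name, SUM(CASE WHEN location IS NULL THEN 1 ELSE 0 END)
    FROM t GROUP BY name ORDER BY name
    """
).fetchall()

# COUNT(*) counts rows, COUNT(location) counts non-null values only.
difference = conn.execute(
    "SELECT name, COUNT(*) - COUNT(location) FROM t GROUP BY name ORDER BY name"
).fetchall()

# WHERE filter: names without any null location are not returned.
filtered = conn.execute(
    """
    SELECT name, COUNT(*) FROM t
    WHERE location IS NULL
    GROUP BY name ORDER BY name
    """
).fetchall()

print(conditional)  # [('x', 2), ('y', 0)]
print(difference)   # [('x', 2), ('y', 0)]
print(filtered)     # [('x', 2)]
```

Which behavior is preferable depends on whether names without nulls should show up in the report.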

Unable to retrieve NULL data

I have three fields Category, Date, and ID. I need to retrieve data that does not belong under certain ID. Here is an example of my query:
SELECT Category, Date, ID
FROM table
WHERE ID NOT IN('1','2','3')
AND Date = '01/06/2015'
After running this query I should get only the records that have no ID, meaning NULL values, because for yesterday's records only IDs 1, 2, and 3 exist and the rest have no value (NULL). For some reason the query filters out the NULL values as well, so I end up with 0 rows. This is very strange to me and I do not understand the cause. All I know is that the ID numbers are string values. Any suggestions?
Try this. NULL values cannot be equated to anything else.
SELECT Category, Date, ID
FROM table
WHERE (ID NOT IN('1','2','3') OR ID IS NULL)
AND Date = '01/06/2015'
Others have already shown how to fix this, so let me try to explain why this happens.
WHERE ID NOT IN('1','2','3')
is equivalent to
WHERE ID <> '1' AND ID <> '2' AND ID <> '3'
Since NULL <> anything yields UNKNOWN, your expression yields UNKNOWN and the record in question is not returned.
See the following Wikipedia article for details on this ternary logic:
Null (SQL): Comparisons with NULL and the three-valued logic (3VL)
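The three-valued logic described above is easy to reproduce. Below is a minimal sketch using Python's sqlite3 module; the table name records, the column name day, and the rows are assumptions chosen to mirror the question, and SQLite stands in for the asker's unnamed DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (category TEXT, day TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("a", "2015-01-06", "1"),
        ("b", "2015-01-06", "2"),
        ("c", "2015-01-06", "3"),
        ("d", "2015-01-06", None),  # no ID
        ("e", "2015-01-06", None),  # no ID
    ],
)

# NULL <> '1' evaluates to NULL (UNKNOWN); sqlite3 returns it as None.
unknown = conn.execute("SELECT NULL <> '1'").fetchone()[0]

# NOT IN alone: the predicate is UNKNOWN for NULL ids, so those rows are dropped.
not_in_only = conn.execute(
    "SELECT category FROM records "
    "WHERE id NOT IN ('1', '2', '3') ORDER BY category"
).fetchall()

# Adding an explicit IS NULL test brings the NULL rows back.
with_is_null = conn.execute(
    "SELECT category FROM records "
    "WHERE (id NOT IN ('1', '2', '3') OR id IS NULL) ORDER BY category"
).fetchall()

print(unknown)       # None
print(not_in_only)   # []
print(with_is_null)  # [('d',), ('e',)]
```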
Take a look at NULL comparison search conditions.
Use the IS NULL or IS NOT NULL clauses to test for a NULL value. This
can add complexity to the WHERE clause. For example, the TerritoryID
column in the AdventureWorks2008R2 Customer table allows null values.
If a SELECT statement is to test for null values in addition to
others, it must include an IS NULL clause:
SELECT CustomerID, AccountNumber, TerritoryID
FROM AdventureWorks2008R2.Sales.Customer
WHERE TerritoryID IN (1, 2, 3)
OR TerritoryID IS NULL
If you really want to be able to compare values to NULLs directly, you can do that as well. This is also described in the above article:
Transact-SQL supports an extension that allows for the comparison
operators to return TRUE or FALSE when comparing against null values.
This option is activated by setting ANSI_NULLS OFF.
Are you sure you want the rows where the ID field is null?
Here is how you do it (assuming the rest of your query is OK):
SELECT Category, Date, ID
FROM table
WHERE ID IS NULL
AND Date = '01/06/2015'
If you want records that do not have a category, then you need to change your query to:
SELECT Category, Date, ID
FROM table
WHERE Category IS NULL
AND Date = '01/06/2015'
You have a couple of options:
SELECT Category, Date, ID
FROM table
WHERE ISNULL(ID, '4') NOT IN('1','2','3')
AND Date = '01/06/2015'
Or what su8898 said
Please note that "IN" and "NOT IN" never match rows where the column is NULL.
In your case, if you want to fetch only records where ID is NULL, you can try the solution vgSefa suggested above.
If you want to pull all records where ID is NULL as well as those where ID is NOT IN('1','2','3'), you could try something like this:
SELECT Category, Date, ID
FROM table
WHERE ID IS NULL
AND Date = '01/06/2015'
UNION ALL
SELECT Category, Date, ID
FROM table
WHERE ID NOT IN('1','2','3')
AND ID IS NOT NULL
AND Date = '01/06/2015'
Try this:
SELECT Category, Date, ID
FROM table
WHERE ID N
AND Date = '01/06/2015'

How do I check if all posts from a joined table has the same value in a column?

I'm building a BI report for a client where there is a 1-n related join involved.
The joined table has a field for employee ID (EmplId).
The query that I've built for this report is supposed to give a 1 in its field "OneEmployee" if all the related posts have the same employee in the EmplId field, null if it's different employees, i.e:
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'John'
This should give a 1 in the said field in the query
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'George'
This should leave the said field blank
The idea is to create a field where a case expression checks this and returns the correct value. My question is whether there is a way to check for this in SQL.
select not count(*) from your_table
where employee_id = GIVEN_ID
and your_field not in ( select min(your_field)
from your_table
where employee_id = GIVEN_ID);
Note: my first idea was to use LIMIT 1 in the inner query, but MySQL didn't like it, so min it was. The point is to pick any one value, but only one. Min should work, but the field should be indexed; then this query will execute rather fast, as only indexes would be used (obviously employee_id should also be indexed).
Note 2: Do not get confused by the not in front of count(*). You want 1 when no row is different, so I count the differing rows and return not count(*), which is 1 if the count is 0 and 0 otherwise.
Seems a job for a window COUNT():
SELECT
…,
CASE COUNT(DISTINCT TaskTransHours.EmplId) OVER () WHEN 1 THEN 1 END
AS OneEmployee
FROM …
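Some database engines reject DISTINCT inside a window aggregate, so a grouped query is a possible fallback for the same check. The sketch below uses Python's sqlite3 module; the TaskId column and the sample rows are assumptions, since the question does not show the join key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# TaskId is a hypothetical key linking each hours row to its TaskTrans row.
conn.execute("CREATE TABLE TaskTransHours (TaskId INTEGER, EmplId TEXT)")
conn.executemany(
    "INSERT INTO TaskTransHours VALUES (?, ?)",
    [
        (1, "John"), (1, "John"),    # same employee on every row
        (2, "John"), (2, "George"),  # two different employees
    ],
)

# One row per task: 1 when all related rows share an employee, NULL otherwise.
ONE_EMPLOYEE = """
    SELECT TaskId,
           CASE WHEN COUNT(DISTINCT EmplId) = 1 THEN 1 END AS OneEmployee
    FROM TaskTransHours
    GROUP BY TaskId
    ORDER BY TaskId
"""

one_employee = conn.execute(ONE_EMPLOYEE).fetchall()
print(one_employee)  # [(1, 1), (2, None)]
```

The grouped result could then be joined back to the parent table to supply the OneEmployee field in the report.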