MS Access SQL with self join produces fewer results than the original table; what does the set of records that aren't present in the query represent? - sql

My original table (File05292019) has 22,904 records. I perform a self join on 3 of the fields as shown below and the result is 22,886. Why is this the case? What do the missing records represent?
SELECT File05292019.LastName, File05292019.FirstName, File05292019.SubscriberSocialSecurityNumber
FROM File05292019
INNER JOIN File05292019 AS File05292019_1
ON (File05292019.SubscriberSocialSecurityNumber = File05292019_1.SubscriberSocialSecurityNumber)
AND (File05292019.LastName = File05292019_1.LastName)
AND (File05292019.FirstName = File05292019_1.FirstName)
GROUP BY File05292019.LastName, File05292019.FirstName, File05292019.SubscriberSocialSecurityNumber;

Because of the GROUP BY clause. Without it, you should see duplicate records in the result set.
Check by running this query:
SELECT File05292019.LastName, File05292019.FirstName, File05292019.SubscriberSocialSecurityNumber
FROM File05292019
INNER JOIN File05292019 AS File05292019_1
ON (File05292019.SubscriberSocialSecurityNumber = File05292019_1.SubscriberSocialSecurityNumber)
AND (File05292019.LastName = File05292019_1.LastName)
AND (File05292019.FirstName = File05292019_1.FirstName)

The GROUP BY is the cause:
it means that you have some rows with the same values in these three fields.
You could try using
SELECT File05292019.LastName
, File05292019.FirstName
, File05292019.SubscriberSocialSecurityNumber
, count(*)
FROM File05292019
GROUP BY File05292019.LastName
, File05292019.FirstName
, File05292019.SubscriberSocialSecurityNumber
HAVING count(*) > 1
to find these rows.

A couple of possibilities:
NULL values exist in the JOIN fields: SubscriberSocialSecurityNumber, LastName, and FirstName. Because NULL = NULL never evaluates to True (the result is Null, i.e. unknown), an inner join excludes every record that has a NULL in any of the join fields.
Duplicate values exist in the GROUP BY fields, and the aggregation returns one row per distinct group. Add the COUNT(*) AS RecordCount aggregate to see which groups contain more than one record.
Possibly subscribers changed their names but retained the same SSNs; names and SSNs were entered incorrectly; or several records use a default placeholder value like 999-99-9999?
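Both effects can be reproduced on a toy table. The following is a minimal sketch, not the asker's data: it uses Python's built-in sqlite3 module as a stand-in for MS Access, and the table name people, the shortened column name SSN and the five rows are invented for illustration. It checks that the rows "missing" from the self join with GROUP BY are exactly the rows with a NULL in a join field plus the extra copies of duplicated rows.

```python
import sqlite3

# SQLite stands in for MS Access; table name, column names and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (LastName TEXT, FirstName TEXT, SSN TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Smith", "Ann", "111-11-1111"),
        ("Smith", "Ann", "111-11-1111"),  # exact duplicate of the row above
        ("Jones", "Bob", "222-22-2222"),
        ("Lee", "Cy", None),              # NULL in a join field
        (None, "Dee", "333-33-3333"),     # NULL in a join field
    ],
)

total = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]

# Same shape as the query in the question: self join on three fields, then GROUP BY.
joined = conn.execute("""
    SELECT a.LastName, a.FirstName, a.SSN
    FROM people AS a
    INNER JOIN people AS b
      ON a.SSN = b.SSN AND a.LastName = b.LastName AND a.FirstName = b.FirstName
    GROUP BY a.LastName, a.FirstName, a.SSN
""").fetchall()

# Rows that can never satisfy the join because one of the fields is NULL.
null_rows = conn.execute("""
    SELECT COUNT(*) FROM people
    WHERE LastName IS NULL OR FirstName IS NULL OR SSN IS NULL
""").fetchone()[0]

# Extra copies collapsed by GROUP BY: (group size - 1), summed over all groups.
extra_copies = conn.execute("""
    SELECT COALESCE(SUM(n - 1), 0) FROM (
        SELECT COUNT(*) AS n FROM people
        WHERE LastName IS NOT NULL AND FirstName IS NOT NULL AND SSN IS NOT NULL
        GROUP BY LastName, FirstName, SSN
    )
""").fetchone()[0]

print(total, len(joined), null_rows, extra_copies)
```

On the real table, running the two counting queries (NULL rows and extra copies) should show how the 18 missing records split between the two causes.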

Related

Find if a string is in or not in a database

I have a list of IDs
'ACE', 'ACD', 'IDs', 'IN','CD'
I also have a table similar to following structure
ID value
ACE 2
CED 3
ACD 4
IN 4
IN 4
I want a SQL query that returns a list of the IDs that exist in the database and a list of the IDs that do not exist in the database.
The return should be:
1.ACE, ACD, IN (exist)
2.IDs,CD (not exist)
my code is like this
select
ID,
value
from db
where ID in ( 'ACE', 'ACD', 'IDs', 'IN','CD')
However, the query 1) is super slow with all kinds of IDs and 2) returns multiple rows with the same ID. Is there any way in PostgreSQL to 1) return each ID only once and 2) make it run faster?
Assuming no duplicates in either the table or the input, this query should do it:
SELECT t.id IS NOT NULL AS id_exists
, array_agg(ids.id)
FROM unnest(ARRAY['ACE','ACD','IDs','IN','CD']) ids(id)
LEFT JOIN tbl t USING (id)
GROUP BY 1;
Else, please define how to deal with duplicates on either side.
If the LEFT JOIN finds a matching row, the expression t.id IS NOT NULL is true. Else it's false. GROUP BY 1 groups by this expression (1st in the SELECT list), array_agg() forms arrays for each of the two groups.
Related:
Select rows which are not present in other table
Hmmm . . . Is this sufficient:
select ids.id,
exists (select 1 from tbl t where t.id = ids.id) as id_exists
from unnest(array['ACE', 'ACD', 'IDs', 'IN','CD']) ids(id);
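The EXISTS approach can be tried out end to end with a small script. This is only a sketch under assumptions: it runs on Python's built-in sqlite3 module rather than PostgreSQL, so the unnest(array[...]) call is replaced by a VALUES list in a CTE; the table name tbl and the index tbl_id_idx are illustrative, and the index is a guess at what would help with the speed complaint, not something taken from the question.

```python
import sqlite3

# SQLite stands in for PostgreSQL; VALUES in a CTE replaces unnest(array[...]).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("ACE", 2), ("CED", 3), ("ACD", 4), ("IN", 4), ("IN", 4)],
)
# Assumption: an index on the looked-up column keeps each EXISTS probe cheap.
conn.execute("CREATE INDEX tbl_id_idx ON tbl (id)")

wanted = ["ACE", "ACD", "IDs", "IN", "CD"]
placeholders = ", ".join("(?)" for _ in wanted)
sql = f"""
    WITH ids(id) AS (VALUES {placeholders})
    SELECT ids.id,
           EXISTS (SELECT 1 FROM tbl t WHERE t.id = ids.id) AS id_exists
    FROM ids
"""
# One output row per input ID, even though 'IN' appears twice in the table.
result = dict(conn.execute(sql, wanted).fetchall())

existing = [i for i in wanted if result[i]]
missing = [i for i in wanted if not result[i]]
print(existing, missing)
```

Because EXISTS only asks whether at least one matching row is present, duplicates in the table do not multiply the output rows, unlike a plain join.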

SQL Server ISNULL multiple columns

I have the following query which works great but how do I add multiple columns in its select statement? Following is the query:
SELECT ISNULL(
(SELECT DISTINCT a.DatasourceID
FROM [Table1] a
WHERE a.DatasourceID = 5 AND a.AgencyID = 4 AND a.AccountingMonth = 201907), NULL) TEST
So currently I only get one column (TEST) but would like to add other columns such as DataSourceID, AgencyID and AccountingMonth.
If you want to output a row for each requested set of values, whether or not Table1 contains a matching row,
you can put the requested values in a pseudo table in the FROM clause and left outer join it with your Table1.
SELECT ISNULL(Table1.DatasourceId, 999999),
Table1.AgencyId,
Table1.AccountingMonth,
COUNT(*) as count
FROM ( VALUES (5, 4, 201907 ),
(6, 4, 201907 ))
AS requested(DatasourceId, AgencyId, AccountingMonth)
LEFT OUTER JOIN Table1 ON requested.agencyid=Table1.AgencyId
AND requested.datasourceid = Table1.DatasourceId
AND requested.AccountingMonth = Table1.AccountingMonth
GROUP BY Table1.DatasourceId, Table1.AgencyId, Table1.AccountingMonth
Note that:
I used ISNULL on the first column, as you did, to output a particular value (999999) when no value is found.
I did not wrap the other columns in ISNULL(..., NULL) as your query does, since IMHO it is not necessary: if there is no value, a NULL is output anyway.
I added a COUNT(*) column to illustrate an aggregate; you could use another one (SUM, MIN, MAX), or none if you do not need it.
The set of requested values is provided as a table value constructor (see https://learn.microsoft.com/en-us/sql/t-sql/queries/table-value-constructor-transact-sql?view=sql-server-2017).
I added multiple rows of requested conditions: you can request multiple datasources, agencies or months in one query, with one line for each in the output.
If you want only one row, put only one row in the "requested" pseudo table.
There must be a GROUP BY even if you do not use an aggregate (COUNT, SUM or other): it gives the same behavior as your DISTINCT clause and restricts the output to a single line per requested set of values.
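The requested-values pattern can be sketched in runnable form as follows. This is a variation rather than the query above: it uses Python's built-in sqlite3 module in place of SQL Server, the sample rows are invented, and it groups by the columns of the requested pseudo table instead of the Table1 columns, so that two requests without a match would not collapse into a single all-NULL group. It also counts a Table1 column rather than using COUNT(*), so an unmatched request reports 0 instead of 1.

```python
import sqlite3

# SQLite stands in for SQL Server; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (DatasourceID INTEGER, AgencyID INTEGER, AccountingMonth INTEGER)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [(5, 4, 201907), (5, 4, 201907), (7, 4, 201907)],
)

rows = conn.execute("""
    WITH requested(DatasourceID, AgencyID, AccountingMonth) AS (
        VALUES (5, 4, 201907),
               (6, 4, 201907)
    )
    SELECT r.DatasourceID, r.AgencyID, r.AccountingMonth,
           COUNT(t.DatasourceID) AS matches   -- counts only matched Table1 rows
    FROM requested AS r
    LEFT JOIN Table1 AS t
      ON  t.DatasourceID    = r.DatasourceID
      AND t.AgencyID        = r.AgencyID
      AND t.AccountingMonth = r.AccountingMonth
    GROUP BY r.DatasourceID, r.AgencyID, r.AccountingMonth
    ORDER BY r.DatasourceID
""").fetchall()

for row in rows:
    print(row)
```

Each requested combination appears exactly once in the output, with a match count of zero when Table1 has nothing for it.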
It seems to me that you want to see whether data exists. I guess that your AgencyID is a foreign key to an Agency table, that DataSourceID is likewise a foreign key to a Datasource table, and that you have an AccountingMonth table which holds all accounting periods:
SELECT ds.ID as DataSourceID , ag.ID as AgencyID , am.ID as AccountingMonth ,
COUNT(a.DataSourceID) as [Count]
FROM [Table1] a
RIGHT JOIN [Datasource] ds ON ds.ID = a.DataSourceID
RIGHT JOIN [Agency] ag ON ag.ID = a.AgencyID
RIGHT JOIN [AccountingMonth] am on am.ID = a.AccountingMonth
GROUP BY ds.ID, ag.ID, am.ID
In this way you can see the count of records for each group. Notice the RIGHT JOIN: you must use RIGHT JOIN if you want to include all records from the "right" table.
Your query has DISTINCT a.DatasourceID and WHERE a.DatasourceID = 5, so it returns 5 if the table contains rows that match your WHERE criteria, and NULL if there is no data. If you removed a.DatasourceID = 5 from the WHERE clause, your query could fail with an error saying that the subquery returned more than one value.
The way you are doing it only allows for one column and one record, which is given the name TEST. It does not look like you really need to test for NULL: you replace NULL with NULL, so that does nothing to help you. Remove the NULL testing and return a full recordset; DISTINCT would also limit the result to one record. When working with a single table you don't need an alias, and bracketed identifiers are not required if the names contain no spaces or keywords. If you need to know whether the recordset is empty, test for it in the calling program.
SELECT DatasourceID, AgencyID,AccountingMonth
FROM Table1
WHERE DatasourceID = 5 AND AgencyID = 4 AND AccountingMonth = 201907

Select one table column with distinct record with other table all data

I have two tables, 'userfoodcategory' and 'MenuMaster'.
'userfoodcategory' holds the food categories, and 'MenuMaster' holds multiple items per category along with a column 'isnonveg'.
I want to query the 'userfoodcategory' data with one additional column, 'isnonveg', which lives in the 'MenuMaster' table.
I am trying the query below, but it returns redundant records:
SELECT DISTINCT ufc.*, MM.isnonveg
FROM MenuMaster MM
LEFT JOIN userfoodcategory ufc ON MM.categoryid = ufc.foodcategoryid
WHERE ufc.USERID = 19 --and MM.isnonveg IS NULL
order by ufc.foodcategoryid
For more details, please have a look at the screenshots below.
I also want this as a LINQ query, but since I am new to LINQ I am trying to build it in SQL first and convert it to LINQ afterwards.
Thanks in advance.
You can try the query below:
SELECT DISTINCT ufc.*, MM.isnonveg
FROM (select distinct categoryid,isnonveg FROM MenuMaster) MM
LEFT JOIN userfoodcategory ufc ON MM.categoryid = ufc.foodcategoryid
WHERE ufc.USERID = 19 --and MM.isnonveg IS NULL
order by ufc.foodcategoryid

Subquery returns more than one row and stops working

I have a table which has got a column. I have to fetch the values from the column and modify it in the view before comparing it with an externally supplied value.
For that I am using the following query:
SELECT COUNT(*)
FROM tblMaster WITH (NOLOCK)
WHERE (SELECT test
FROM
(SELECT RIGHT('00000000000000000' + RTRIM(CODE), 17) as test
FROM tblMaster ) t) = '00001231231231231'
The subquery returns the modified values of the column as a derived column, and I want to compare that derived column with the supplied value. I don't know whether a subquery that returns a whole column can be used on the left side of an equality.
Subquery returns multiple values.
You don't need a subquery for this:
SELECT COUNT(*)
FROM tblMaster WITH (NOLOCK)
WHERE right('00000000000000000' + RTRIM(CODE), 17) = '00001231231231231';
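You can simplify this. If CODE is a numeric column, you don't need to add the zeros at all, because leading zeros make no difference to a numeric comparison. (If CODE is a string that may itself be stored with leading zeros, keep the padded comparison.) You don't need the subquery either: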
SELECT COUNT(*)
FROM tblMaster WITH (NOLOCK)
WHERE CODE = '1231231231231'
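The difference between the padded and the unpadded comparison shows up when CODE is a string column. The following is a minimal sketch under assumptions: it runs on Python's built-in sqlite3 module, where substr(x, -17) plays the role of SQL Server's RIGHT(x, 17) and || replaces + for concatenation, and the three stored codes are invented.

```python
import sqlite3

# SQLite stands in for SQL Server: substr(x, -17) ~ RIGHT(x, 17), || ~ +.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblMaster (CODE TEXT)")
conn.executemany(
    "INSERT INTO tblMaster VALUES (?)",
    [
        ("1231231231231",),     # stored without leading zeros
        ("0001231231231231",),  # stored with some leading zeros
        ("999",),               # unrelated code
    ],
)

target = "00001231231231231"  # externally supplied, zero-padded to 17 characters

# Pad every stored code to 17 characters, then compare: no subquery needed.
padded_matches = conn.execute(
    """
    SELECT COUNT(*) FROM tblMaster
    WHERE substr('00000000000000000' || rtrim(CODE), -17) = ?
    """,
    (target,),
).fetchone()[0]

# Plain comparison against the unpadded value.
plain_matches = conn.execute(
    "SELECT COUNT(*) FROM tblMaster WHERE CODE = '1231231231231'"
).fetchone()[0]

print(padded_matches, plain_matches)
```

With string codes the padded comparison finds both spellings of the same code, while the plain comparison only finds the row stored without leading zeros.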

Use of CASE statement values in THEN expression

I am attempting to use a case statement but keep getting errors. Here's the statement:
select TABLE1.acct,
CASE
WHEN TABLE1.acct_id in (select acct_id
from TABLE2
group by acct_id
having count(*) = 1 ) THEN
(select name
from TABLE3
where TABLE1.acct_id = TABLE3.acct_id)
ELSE 'All Others'
END as Name
from TABLE1
When I replace the TABLE1.acct_id in the THEN expression with a literal value, the query works. When I try to use TABLE1.acct_id from the WHEN part of the query, I get an error saying the result is more than one row. It seems like the THEN expression is ignoring the single value that the WHEN condition was using. No idea, maybe this isn't even a valid use of the CASE statement.
I am trying to see names for accounts that have one entry in TABLE2.
Any ideas would be appreciated, I'm kind of new at SQL.
First, you are missing a comma after TABLE1.acct. Second, you have aliased TABLE1 as acct, so you should use that.
Select acct.acct
, Case
When acct.acct_id in ( Select acct_id
From TABLE2
Group By acct_id
Having Count(*) = 1 )
Then ( Select name
From TABLE3
Where acct.acct_id = TABLE3.acct_id
Fetch First 1 Rows Only)
Else 'All Others'
End as Name
From TABLE1 As acct
As others have said, you should adjust your THEN clause to ensure that only one value is returned. You can do that by adding Fetch First 1 Rows Only to your subquery.
Then ( Select name
From TABLE3
Where acct.acct_id = TABLE3.acct_id
Fetch First 1 Rows Only)
FETCH is not accepted inside the CASE expression: "Keyword FETCH not expected. Valid tokens: ) UNION EXCEPT."
select name from TABLE3 where TABLE1.acct_id = TABLE3.acct_id
will give you all the names in TABLE3 that have an accompanying row in TABLE1. The rows selected from TABLE2 in the WHEN condition don't enter into it.
Must be getting more than one value.
You can replace the body with...
(select count(name) from TABLE3 where TABLE1.acct_id = TABLE3.acct_id)
... to narrow down which rows are returning multiples.
It may be the case that you just need a DISTINCT or a TOP 1 to reduce your result set.
Good luck!
I think that what is happening here is that your CASE must return a single value, because it will be the value of the "Name" column. The subquery (select acct_id from TABLE2 group by acct_id having count(*) = 1 ) is OK because it is used with IN, which accepts any number of rows. (select name from TABLE3 where TABLE1.acct_id= TABLE3.acct_id) could return multiple values depending on your data. The problem is that you are trying to shove multiple values into a single field of a single row.
The next thing to do would be to find out what data causes multiple rows to be returned by (select name from TABLE3 where TABLE1.acct_id= TABLE3.acct_id), and see if you can further limit this query to only return one row. If need be, you could even try something like ...AND ROWNUM = 1 (for Oracle - other DBs have similar ways of limiting rows returned).
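The diagnostic step can be sketched as follows: first list the acct_id values that have more than one name in TABLE3, then force the THEN subquery down to a single value. This is an illustration under assumptions: it runs on Python's built-in sqlite3 module (which, unlike the asker's database, does not raise an error for a multi-row scalar subquery), the sample rows are invented, and MIN(name) is a swapped-in way of picking one name that avoids FETCH FIRST, TOP and ROWNUM; which name is the right one to keep depends on the data.

```python
import sqlite3

# SQLite stands in for the asker's database; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (acct TEXT, acct_id INTEGER);
    CREATE TABLE TABLE2 (acct_id INTEGER);
    CREATE TABLE TABLE3 (acct_id INTEGER, name TEXT);
    INSERT INTO TABLE1 VALUES ('A', 1), ('B', 2), ('C', 3);
    INSERT INTO TABLE2 VALUES (1), (2), (2), (3);
    INSERT INTO TABLE3 VALUES (1, 'Alpha'), (1, 'Alfa'), (2, 'Beta'), (3, 'Gamma');
""")

# Step 1: which accounts would make the THEN subquery return several rows?
offenders = conn.execute("""
    SELECT acct_id, COUNT(name)
    FROM TABLE3
    GROUP BY acct_id
    HAVING COUNT(name) > 1
""").fetchall()

# Step 2: an aggregate guarantees that the scalar subquery yields one value.
rows = conn.execute("""
    SELECT t1.acct,
           CASE
               WHEN t1.acct_id IN (SELECT acct_id
                                   FROM TABLE2
                                   GROUP BY acct_id
                                   HAVING COUNT(*) = 1)
               THEN (SELECT MIN(name)
                     FROM TABLE3 t3
                     WHERE t3.acct_id = t1.acct_id)
               ELSE 'All Others'
           END AS Name
    FROM TABLE1 AS t1
    ORDER BY t1.acct
""").fetchall()

print(offenders)
print(rows)
```

Account B has two entries in TABLE2, so it falls through to 'All Others'; account A has two names in TABLE3, and MIN(name) keeps the alphabetically first one.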