How to find duplicates in a table using Access SQL? - sql

I try to use an SQL query in Access but it doesn't work. Why?
SELECT * FROM table
EXCEPT
SELECT DISTINCT name FROM table;
I have a syntax error in FROM statement.

MS Access does not support EXCEPT keyword. You can try using the LEFT JOIN like this:
select t1.* from table t1 left join table t2 on t1.name = t2.name
EDIT:
If you want to find the duplicates in your table then you can try this:
SELECT name, COUNT(*)
FROM table
GROUP BY name
HAVING COUNT(*) > 1
You can also refer: Create a Query in Microsoft Access to Find Duplicate Entries in a Table and follow the steps to find the duplicates in your table.
First open the MDB (Microsoft Database) containing the table you want
to check for duplicates. Click on the Queries tab and New.
This will open the New Query dialog box. Highlight Find Duplicates
Query Wizard then click OK.
Now highlight the table you want to check for duplicate data. You can
also choose Queries or both Tables and Queries. I have never seen a
use for searching Queries … but perhaps it would come in handy for
another’s situation. Once you’ve highlighted the appropriate table
click Next.
Here we will choose the field or fields within the table we want to
check for duplicate data. Try to avoid generalized fields.
Name the Query and hit Finish. The Query will run right away and pop
up the results. Also the Query is saved in the Queries section of
Access.
Depending upon the selected tables and fields your results will look
something similar to the shots below which show I have nothing
duplicated in the first shot and the results of duplicates in the
other.

use HAVING COUNT(name) > 1 clause
SELECT * FROM Table1
WHERE [name] IN
(SELECT name, Count(name)
FROM Table1
GROUP BY name
HAVING COUNT(name)>1)

You can use LEFT JOIN or EXISTS
LEFT JOIN
SELECT DISTINCT t1.NAME FROM table1 as t1
LEFT JOIN table2 as t2 on t1.name=t2.name
WHERE t2.name is null
;
NOT EXITS
SELECT T1.NAME FROM table1 as t1 where not exists
(SELECT T2.NAME FROM table2 as t2 where t1.name=t2.name)

Whether Access supports except or not is one issue. The other is that you are not using it properly. You have select * above the word except and select name below. That is not valid sql. If you tried that in SQL Server, your error message would be All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.

Related

BigQuery how to automatically handle "duplicate column names" on left join

I am working with a dataset of tables that (a) often requires joining tables together, however also (b) frequently has duplicate columns names. Any time I write a query along the lines of:
SELECT
t1.*, t2.*
FROM t1
LEFT JOIN t2 ON t1.this_id = t2.matching_id
...I get the error Duplicate column names in the result are not supported. Found duplicate(s): this_col, that_col, another_col, more_cols, dupe_col, get_the_idea_col
I understand that with BigQuery, it is better to avoid using * when selecting tables, however my data tables aren't too big + my bigquery budget is high, and doing these joins with all columns helps significantly with data exploration.
Is there anyway BigQuery can automatically handle / rename columns in these situations (e.g. prefix the column with the table name), as opposed to not allowing the query all together?
Thanks!
The simplest way is to select records rather than columns:
SELECT t1, t2
FROM t1 LEFT JOIN
t2
ON t1.this_id = t2.matching_id;
This is pretty much what I do for ad hoc queries.
If you want the results as columns and not records (they don't look much different in the results), you can use EXCEPT:
SELECT t1.* EXCEPT (duplicate_column_name),
t2.* EXCEPT (duplicate_column_name),
t1.duplicate_column_name as t1_duplicate_column_name,
t2.duplicate_column_name as t2_duplicate_column_name
FROM t1 LEFT JOIN
t2
ON t1.this_id = t2.matching_id;
Is there anyway BigQuery can automatically handle / rename columns in these situations (e.g. prefix the column with the table name), as opposed to not allowing the query all together?
This is possible with BigQuery Legacy SQL - which can be handy for data exploration unless you are dealing with data types or using some functions/features specific to standard sql
So below
#legacySQL
SELECT t1.*, t2.*
FROM table1 AS t1
LEFT JOIN table2 AS t2
ON t1.this_id = t2.matching_id
will produce output where all column names will be prefixed with respective alias like t1_this_id and t2_matching_id

SQL Query - Select Value from T1 where second value fully met in T2

I can do this in an ugly stored procedure with temp tables and whatnot, but I know an experienced developer could do this SO much more elegantly than what I've come up with. In fact, I'd kind of rather not have to call the sproc at all, but just have one query that gives me what I need.
I'm working with two tables:
T1 BillingDirectivesNeeded
T2 BillingDirectives.
T1 Has two fields relevant to this task -
PKey
WBS1.
There will be many PKeys associated with each WBS1.
T2 has only one field of interest
PKey.
The task I'm trying to address is geting a list of WBS1s from T1 that have ALL of their needed directives in T2 before I enable their import.
We want to import a WBS1 ONLY when all of the PKeys for that WBS1 are found in T2. If not, I'll just leave them grayed out.
I've tried a dozen different ways to get this to happen over the last few hours, and I seem to have a mental block. The pseudo-code would look something like this:
select T1.WBS1 from BillingDirectiveNeeded T1
where [all the T1.PKeys for T1.WBS1 can be found in BillingDirectives T2]
You can try using a Where Exists clause:
Select T1.WBS1
From BillingDirectiveNeeded T1
Where Exists
(
Select 1
From BillingDirectives T2
Where T2.PKey = T1.PKey
)
select DISTINCT T1.WBS1 from BillingDirectiveNeeded T1 where T1.PKey in (SELECT T2.PKey FROM BillingDirectives T2)

ORA-01446 occurs if I try to select random rows using SAMPLE clause in Oracle

Since the number of rows in the table is too large I switched from "ORDER BY dbms_random.value" construction for getting 1000 random rows to SAMPLE clause. It takes less than a second instead of 3 minutes to complete. But on some tables I get this error
ORA-01446: cannot select ROWID from view with DISTINCT, GROUP BY, etc
My query looks like this:
SELECT t1.columnA FROM
(SELECT columnA FROM table1 sample(1) where rownum <= 1000) t1
JOIN table2 t2
ON (t1.columnA = t2.columnA)
WHERE t2.columnB IS NOT NULL
and it works fine on some tables, but fails on others. I gave up googling, could you please advise any workaround in my situation.
As I expected SAMPLE clause works faster than all other solutions (Here you can see some of them)
Because I'm new to Oracle DBs generally and Oracle SQL Developer in particaular I mistakenly called view a "table". After I found that out the solution was clear.
SOLUTION: I had to look at the SQL query that forms a view and replace view name with that query.
For example my table1 was actually a view whose name I replaced with SELECT query that forms that view:
SELECT t1.columnA FROM
(SELECT columnA FROM (select distinct tt1.columnA, tt2.columnC
from table22 tt2, table11 tt1
where tt2.columnC = tt1.columnA) sample(1) where rownum <= 1000) t1
JOIN table2 t2
ON (t1.columnA = t2.columnA)
WHERE t2.columnB IS NOT NULL
After that I could work with tables and apply SAMPLE to them! Thank you everybody, great website! =)
PS: sorry for my English and ugly code facepalm.jpg

Problem with sql query

I'm using MySQL and I'm trying to construct a query to do the following:
I have:
Table1 [ID,...]
Table2 [ID, tID, start_date, end_date,...]
What I want from my query is:
Select all entires from Table2 Where Table1.ID=Table2.tID
**where at least one** end_date<today.
The way I have it working right now is that if Table 2 contains (for example) 5 entries but only 1 of them is end_date< today then that's the only entry that will be returned, whereas I would like to have the other (expired) ones returned as well. I have the actual query and all the joins working well, I just can't figure out the ** part of it.
Any help would be great!
Thank you!
SELECT * FROM Table2
WHERE tID IN
(SELECT Table2.tID FROM Table1
INNER JOIN Table2 ON Table1.ID = Table2.tID
WHERE Table2.end_date < NOW
)
The subquery will select all tId's that match your where clause. The main query will use this subquery to filter the entries in table 2.
Note: the use of inner join will filter all rows from table 1 with no matching entry in table 2. This is no problem; these entries wouldn't have matched the where clause anyway.
Maybe, just maybe, you could create a sub-query to join with your actual tables and in this subquery you use a count() which can be used later on you where clause.

What would be the best way to write this query

I have a table in my database that has 1.1MM records. I have another table in my database that has about 2000 records under the field name, "NAME". What I want to do is do a search from Table 1 using the smaller table and pull the records where they match the smaller tables record. For example Table 1 has First Name, Last Name. Table 2 has Name, I want to find every record in Table 1 that contains any of Table 2 Names in either the first name field or the second name field. I tried just making an access query but my computer just froze. Any thoughts would be appreaciated.
have you considered the following:
Select Table1.FirstName, Table1.LastName
from Table1
where EXISTS(Select * from Table2 WHERE Name = Table1.FirstName)
or EXISTS(Select * from Table2 WHERE Name = Table1.LastName)
I have found before that on large tables this might work better than an inner join.
Be sure to create indexes on Table1.first_name, Table1.last_name, and Table2.name. They will dramatically speed up your query.
Edit: For Microsoft Access 2007, see CREATE INDEX.
See above previous notes about indexes, but I believe from your description, you want something like:
select table1.* from table1
inner join
table2 on (table1.first_name = table2.name OR table1.last_name = table2.name);
It should go something like this,
Select Table1.FirstName, Table1.LastName
from Table1
where Table1.FirstName IN (Select Distinct Name from Table2)
or Table1.LastName IN (Select Distinct Name from Table2)
And there are various other ways to run this same query, i would suggest you see execution plan for each of these queries to find out which one is the fastest. In addition creating indexes on the column which is used in a "where" condition will also speed up the query.
i agree with astander. based on my experience, using EXIST instead of IN is a lot faster.