How to use DISTINCT in nested personal geodatabase SQL in ArcGIS - sql

I have this statement, used in ArcGIS for a personal geodatabase.
It selects the top three records in [MyColumn1], but not if [MyColumn2] equals an inline variable.
[MyColumn1] in(SELECT TOP 3 ( [MyColumn1] )
FROM MyTable
WHERE [MyColumn2] <> %Variable%
ORDER BY [MyColumn1] DESC)
But I also need to add DISTINCT, because sometimes there are repeated values in [MyColumn1], so that 4 records are selected.
How can I include DISTINCT in this expression so that ArcCrash and a personal geodatabase can handle it? There is a lot on this subject, but nothing specific to working with Arc, or at least Access.
This doesn't work:
[MyColumn1] in(SELECT TOP 3 ( [MyColumn1] )
FROM MyTable
WHERE [MyColumn2] <> %Variable%
ORDER BY [MyColumn1] DESC
DISTINCT [MyColumn1] )
Nor does this
[MyColumn1] in(SELECT TOP 3 ( [MyColumn1] )
FROM MyTable
WHERE [MyColumn2] <> %Variable%
ORDER BY [MyColumn1] DESC
GROUP BY [MyColumn1])

Since this isn't the full query, I can only go on what is shown. Your SELECT TOP 3 is going to return 3 values, but if MyColumn1 has duplicates and the subquery returns a value that is duplicated, then every record carrying that value will match the IN condition. For example:
FirstName | LastName
Bob doe
Billy smith
Marie Evans
Bob Lock
If you then run
Select * from table where FirstName in (SELECT DISTINCT TOP 3 FirstName FROM table)
you will get all 4 records, because the inner query returns ("Bob", "Billy", "Marie") and the outer query (Select * from table where FirstName in ("Bob", "Billy", "Marie")) asks for every record with one of those 3 first names, which all four records have. The solution here would be this query:
Select Distinct FirstName from table where FirstName in (SELECT DISTINCT TOP 3 FirstName FROM table)
The problem with this is that DISTINCT applies to every column in the select list, so you cannot also get the LastName column from this query.
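To make the effect concrete, here is a minimal sketch of the name example above. It runs against an in-memory SQLite database from Python, which is an assumption made only so the example is self-contained: SQLite uses LIMIT where Access and SQL Server use TOP, and the table name people is hypothetical.

```python
import sqlite3

# In-memory SQLite table standing in for the real one (assumption:
# SQLite syntax, so LIMIT 3 replaces TOP 3; "people" is a made-up name).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Bob", "doe"), ("Billy", "smith"), ("Marie", "Evans"), ("Bob", "Lock")],
)

top3 = "SELECT DISTINCT FirstName FROM people ORDER BY FirstName LIMIT 3"

# The inner query on its own yields three distinct first names.
inner_names = [row[0] for row in conn.execute(top3)]

# The outer IN filter matches every row carrying one of those names,
# so both 'Bob' rows come back and four records are selected.
outer_rows = conn.execute(
    f"SELECT FirstName, LastName FROM people WHERE FirstName IN ({top3})"
).fetchall()

# DISTINCT on the outer query collapses the duplicates,
# at the cost of dropping LastName from the result.
distinct_names = conn.execute(
    f"SELECT DISTINCT FirstName FROM people WHERE FirstName IN ({top3}) "
    "ORDER BY FirstName"
).fetchall()

print(inner_names)
print(len(outer_rows))
print(distinct_names)
```

The count of outer rows is what matters here: three distinct values in the subquery still select four records.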

I used ArcGIS tool Delete Identical which deletes the first of multiple, identical records. HA. quite perfect.

Related

How to query how many different IDs use the same column value?

I have this homework assignment where I'm attempting to query a table to find the id numbers that are all using the same column value, let's say last name in this case. I'd like to find the ids that use the same last name more than once, and have a column that tells me the total number of unique IDs that used that same last name.
SELECT id, COUNT(*) as ID_count
FROM [table]
WHERE l_name IN
(
SELECT l_name
FROM [table]
GROUP BY l_name HAVING COUNT(*)>1
)
GROUP BY id;
This is what I have so far. It grants me the ID number, but the count(*) is not what I'm going for. What I'm instead trying to get is how many unique IDs have "Smith" as their last name, instead of all the occurrences of one specific ID that has used "Smith".
I've tried different things but I feel like I'm at a roadblock. Any hints or tips are nice; I don't need this problem solved 100%, but I feel as if I can't get past the idea of using count(*).
Thanks all.
It sounds like you were already there WITHIN the inner query. Just add the count to it for the output.
SELECT
t1.id,
t1.l_name,
max( PQ.UniqCount ) UniqCount,
COUNT(*) as countForThisSingleID
FROM
[table] t1
JOIN
( SELECT
t.l_name,
COUNT( DISTINCT t.ID ) as UniqCount
FROM
[table] t
GROUP BY
t.l_name
HAVING
COUNT( DISTINCT t.ID ) > 1 ) PQ
on t1.l_name = PQ.l_name
group by
t1.id,
t1.l_name
order by
t1.l_name,
t1.id
So by doing a COUNT( DISTINCT ) in the inner pre-query (alias PQ), you get a count of distinct IDs for each L_Name. I don't know whether your [table] has multiple entries for the same ID, so I am applying the DISTINCT to be safe. The same goes for the HAVING clause. At least now the inner pre-query gets the overall distinct counts for a given L_Name value.
Now, doing a JOIN to the outer table on that L_Name brings the corresponding count into the result, along with the l_name that it qualified against. So if you have a table with 18 DISTINCT ID instances of John, 37 of Karen and 11 of Mike, your inner query will get those. Joined to the outer table, you will get EACH instance of John with its corresponding ID, then all the Karen instances and all the Mike instances.
The count in the outer query is the number of times that one ID (and name) appears in the table. So if the table had ID = 5, L_Name = John, and ID 5 appeared 3 times in the table, the output of his record might look like
ID    L_Name   countForThisSingleID   UniqCount
5     John     3                      18
72    John     8                      18
127   John     2                      18
etc...
Similarly the output would include all Karen's and Mike's within the table (and any others that qualify).
Again, I don't know whether your [table] holds one row per ID, such as a master customer lookup table, or is something like an order table where a single person's ID may appear more than once, so I'm not positive what final answer you are looking for.
But I think I have given you a bunch to chew on and run with.
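As a rough check of the approach above, this sketch runs the same query shape against a small in-memory SQLite table from Python. The table name orders and its rows are made up for illustration. Karen's single ID appears twice, so a plain COUNT(*) > 1 would have kept her, while COUNT(DISTINCT id) > 1 does not.

```python
import sqlite3

# Hypothetical sample data: (id, l_name). ID 5 appears three times,
# ID 72 twice, and Karen's only ID (9) appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, l_name TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(5, "John"), (5, "John"), (5, "John"),
     (72, "John"), (72, "John"),
     (9, "Karen"), (9, "Karen")],
)

# Inner pre-query (pq): distinct IDs per last name, keeping only names
# used by more than one ID. Outer query: rows per individual ID.
rows = conn.execute("""
    SELECT t1.id,
           t1.l_name,
           MAX(pq.UniqCount) AS UniqCount,
           COUNT(*)          AS countForThisSingleID
    FROM orders t1
    JOIN (SELECT t.l_name, COUNT(DISTINCT t.id) AS UniqCount
          FROM orders t
          GROUP BY t.l_name
          HAVING COUNT(DISTINCT t.id) > 1) pq
      ON t1.l_name = pq.l_name
    GROUP BY t1.id, t1.l_name
    ORDER BY t1.l_name, t1.id
""").fetchall()

for row in rows:
    print(row)
```

Only the two John IDs qualify, each reporting 2 as the number of distinct IDs sharing the name.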

SQL Server 2008 select query difficulty

I have a table with over 100k records. Here's my issue: I have a bunch of columns
CompanyID  CompanyName  CompanyServiceID  ServiceTypeID  Active
----------------------------------------------------------------
1          Xerox        17                33             Yes
2          Microsoft    19                39             Yes
3          Oracle       22                54             Yes
2          Microsoft    19                36             Yes
So here's how my table looks, it has about 30 other columns but they are irrelevant for this question.
Here's my quandary: I'm trying to select all records where CompanyID and CompanyServiceID are the same. As you can see in the table above, Microsoft appears twice with the same CompanyID and CompanyServiceID, but a different ServiceTypeID.
I need to be able to search all records where there are duplicates. The person maintaining this data was very messy and did not update some of the columns properly so I have to go through all the records and find where there are records that have the same CompanyID and CompanyServiceID.
Is there a generic query that would be able to do that?
None of these columns are my primary key, I have a column with record number that increments by 1.
You can try something like this:
SELECT CompanyName, COUNT(CompanyServiceID)
FROM YourTable -- replace with your table name
GROUP BY CompanyName
HAVING ( COUNT(CompanyServiceID) > 1 )
This will return a grouped list of all companies with multiple entries. You can modify what columns you want in the SELECT statement if you need other info from the record as well.
Here's one option using row_number to create the groupings of duplicated data:
select *
from (
select *,
row_number () over (partition by companyId, companyserviceid
order by servicetypeid) rn
from yourtable
) t
where rn > 1
Another option GROUP BY, HAVING and INNER JOIN
SELECT
*
FROM
Tbl A INNER JOIN
(
SELECT
CompanyID,
CompanyServiceID
FROM
Tbl
GROUP BY
CompanyID,
CompanyServiceID
HAVING COUNT(1) > 1
) B ON A.CompanyID = B.CompanyID AND
A.CompanyServiceID = B.CompanyServiceID
Using a join:
Select *
from
Yourtable t1
join
(
select companyid,companyserviceid,count(*) as cnt
from
Yourtable
group by companyid,companyserviceid
having count(*)>1)b
on b.companyid=t1.companyid
and b.companyserviceid=t1.companyserviceid
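For anyone who wants to try the GROUP BY / HAVING / JOIN pattern without a SQL Server instance, here is a small sketch using the question's sample rows in an in-memory SQLite database from Python. The table name companies and the RecordNo column are assumptions standing in for the real table and its incrementing record number.

```python
import sqlite3

# Sample rows from the question; RecordNo is a hypothetical stand-in
# for the incrementing record-number column mentioned there.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE companies (
        RecordNo INTEGER PRIMARY KEY,
        CompanyID INTEGER,
        CompanyName TEXT,
        CompanyServiceID INTEGER,
        ServiceTypeID INTEGER
    )
""")
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "Xerox", 17, 33),
     (2, 2, "Microsoft", 19, 39),
     (3, 3, "Oracle", 22, 54),
     (4, 2, "Microsoft", 19, 36)],
)

# Find (CompanyID, CompanyServiceID) pairs that occur more than once,
# then join back to return every full row belonging to such a pair.
duplicates = conn.execute("""
    SELECT a.RecordNo, a.CompanyID, a.CompanyName,
           a.CompanyServiceID, a.ServiceTypeID
    FROM companies a
    JOIN (SELECT CompanyID, CompanyServiceID
          FROM companies
          GROUP BY CompanyID, CompanyServiceID
          HAVING COUNT(*) > 1) b
      ON a.CompanyID = b.CompanyID
     AND a.CompanyServiceID = b.CompanyServiceID
    ORDER BY a.RecordNo
""").fetchall()

for row in duplicates:
    print(row)
```

Unlike the row_number variant with rn > 1, this returns both Microsoft rows rather than only the extra one.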

Check if tables are identical using SQL in Oracle

I was asked this question during an interview for a Junior Oracle Developer position, the interviewer admitted it was a tough one:
Write a query/queries to check if the table 'employees_hist' is an exact copy of the table 'employees'. Any ideas how to go about this?
EDIT: Consider that tables can have duplicate records so a simple MINUS will not work in this case.
EXAMPLE
EMPLOYEES
NAME
--------
Jack Crack
Jack Crack
Jill Hill
These two would not be identical.
EMPLOYEES_HIST
NAME
--------
Jack Crack
Jill Hill
Jill Hill
If the tables have the same columns, you can use this; this will return no rows if the rows in both tables are identical:
(
select * from test_data_01
minus
select * from test_data_02
)
union
(
select * from test_data_02
minus
select * from test_data_01
);
Identical regarding what? Metadata or the actual table data too?
Anyway, use MINUS.
select * from table_1
MINUS
select * from table_2
So, if the two tables are really identical, i.e. the metadata and the actual data, it would return no rows. Else, it would prove that the data is different.
If you receive an error, it would mean the metadata itself is different.
Update: if the data is not the same and one of the tables has duplicates,
just select the unique records from one of the tables and apply MINUS against the other table.
One possible solution, which caters for duplicates, is to create a subquery which does a UNION ALL on the two tables and includes the number of duplicates contained within each table by grouping on all the columns. The outer query can then group on all the columns, including the row count column. If the tables match, there should be no rows returned:
create table employees (name varchar2(100));
create table employees_hist (name varchar2(100));
insert into employees values ('Jack Crack');
insert into employees values ('Jack Crack');
insert into employees values ('Jill Hill');
insert into employees_hist values ('Jack Crack');
insert into employees_hist values ('Jill Hill');
insert into employees_hist values ('Jill Hill');
with both_tables as
(select name, count(*) as row_count
from employees
group by name
union all
select name, count(*) as row_count
from employees_hist
group by name)
select name, row_count from both_tables
group by name, row_count having count(*) <> 2;
gives you:
Name         Row_count
Jack Crack   1
Jack Crack   2
Jill Hill    1
Jill Hill    2
This tells you that both names appear once in one table and twice in the other, and therefore the tables don't match.
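The difference between the plain set-difference check and the count-based check can be seen with the example tables. This is only a sketch: it uses an in-memory SQLite database from Python, where EXCEPT plays the role of Oracle's MINUS, and TEXT replaces varchar2.

```python
import sqlite3

# Example tables from the question, recreated in SQLite (assumption:
# EXCEPT is used in place of Oracle's MINUS).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT)")
conn.execute("CREATE TABLE employees_hist (name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?)",
                 [("Jack Crack",), ("Jack Crack",), ("Jill Hill",)])
conn.executemany("INSERT INTO employees_hist VALUES (?)",
                 [("Jack Crack",), ("Jill Hill",), ("Jill Hill",)])

# Plain set difference in both directions: duplicates are ignored,
# so the two tables wrongly look identical.
plain_diff = (
    conn.execute("SELECT name FROM employees "
                 "EXCEPT SELECT name FROM employees_hist").fetchall()
    + conn.execute("SELECT name FROM employees_hist "
                   "EXCEPT SELECT name FROM employees").fetchall()
)

# Count-based comparison: a (name, row_count) pair that does not
# appear exactly twice exists in only one of the tables.
count_diff = conn.execute("""
    WITH both_tables AS (
        SELECT name, COUNT(*) AS row_count FROM employees GROUP BY name
        UNION ALL
        SELECT name, COUNT(*) AS row_count FROM employees_hist GROUP BY name
    )
    SELECT name, row_count
    FROM both_tables
    GROUP BY name, row_count
    HAVING COUNT(*) <> 2
    ORDER BY name, row_count
""").fetchall()

print(plain_diff)
print(count_diff)
```

The first check comes back empty, while the second reports the four mismatched (name, count) pairs listed above.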
select name, count(*) n from EMPLOYEES group by name
minus
select name, count(*) n from EMPLOYEES_HIST group by name
union all (
select name, count(*) n from EMPLOYEES_HIST group by name
minus
select name, count(*) n from EMPLOYEES group by name)
You could merge the two tables and then subtract one of the tables from the result. If the result of the subtraction is an empty table, then you know that the tables must be the same, since the merge had no effect (every row and column were effectively the same).
How do I merge two tables with different column number while removing duplicates?
That link provides a good way to merge the two tables without duplicates without knowing what the columns are.
Ensure the rows are unique by adding a pseudo column
WITH t1 AS
(SELECT <All_Columns>
, row_number() OVER
(PARTITION BY <All_Columns>
ORDER BY <All_Columns>) row_num
FROM employees)
, t2 AS
(SELECT <All_Columns>
, row_number() OVER
(PARTITION BY <All_Columns>
ORDER BY <All_Columns>) row_num
FROM employees_hist)
(SELECT *
FROM t1
MINUS
SELECT *
FROM t2)
UNION ALL
(SELECT *
FROM t2
MINUS
SELECT *
FROM t1)
Use row_number to make sure there are no duplicate rows. Now you can use minus and if there are no results, the tables are identical.
SELECT ROW_NUMBER() OVER (Order By Name) rn, t.*
FROM tab1 t
MINUS
SELECT ROW_NUMBER() OVER (Order By Name) rn, t.*
FROM tab2 t

SQL Separating Distinct Values using single column

Does anyone happen to know a way of taking the 'Distinct' command but applying it to only a single column? For lack of a better example, something similar to this:
Select (Distinct ID), Name, Term from Table
So it would get rid of rows with duplicate IDs but still use the other columns' information. I would use distinct on the full query, but the rows are all different because of the data in certain columns. And I would need to output only the topmost term between the two duplicates:
ID Name Term
1 Suzy A
1 Suzy B
2 John A
2 John B
3 Pete A
4 Carl A
5 Sally B
Any suggestions would be helpful.
select t.ID, t.Name, min(t.Term) as Term
from [Table] t
group by t.ID, t.Name
You can use row number for this:
Select ID, Name, Term from (
Select ID, Name, Term, ROW_NUMBER ( )
OVER ( PARTITION BY ID order by Term) as rn from [Table]
) as tbl
Where rn = 1
Order by determines the order from which the first row will be picked.
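Here is a minimal sketch of the row-number approach on the question's sample data, run from Python against an in-memory SQLite database (an assumption; it needs SQLite 3.25 or later for window functions, and the table name terms is hypothetical). Ordering by Term inside the partition means the first term per ID is the one kept.

```python
import sqlite3

# Sample rows from the question in a made-up table called "terms".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE terms (ID INTEGER, Name TEXT, Term TEXT)")
conn.executemany(
    "INSERT INTO terms VALUES (?, ?, ?)",
    [(1, "Suzy", "A"), (1, "Suzy", "B"),
     (2, "John", "A"), (2, "John", "B"),
     (3, "Pete", "A"), (4, "Carl", "A"), (5, "Sally", "B")],
)

# Number the rows within each ID by Term, then keep only the first one.
# The rn filter has to sit outside the subquery that computes rn.
one_per_id = conn.execute("""
    SELECT ID, Name, Term
    FROM (SELECT ID, Name, Term,
                 ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Term) AS rn
          FROM terms) AS tbl
    WHERE rn = 1
    ORDER BY ID
""").fetchall()

for row in one_per_id:
    print(row)
```

Changing the ORDER BY inside OVER (for example to Term DESC) changes which of the duplicate rows survives.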

Distinct SQL Query

I have a SQL Server 2008 database with the following information in a table:
ID Name
-- ----
1 John
2 Jill
3 John
4 Phil
5 Matt
6 Jill
I want to display the unique names in a drop down list. Because of this, I need just one of the IDs associated with the unique name. I know it's dirty. I didn't create this mess. I just need the unique names with one of the ids. How do I write a query that will do that? I know that the following won't work because of the ID field.
SELECT DISTINCT
[ID], [Name]
FROM
MyTable
SELECT MIN(ID) AS ID, [Name]
FROM MyTable
GROUP BY [Name]
This will return the first (i.e. MINimum) ID for each distinct Name.
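A quick way to see what the MIN(ID) query returns for the sample data is the sketch below. It uses an in-memory SQLite database from Python purely for convenience (an assumption; the query itself is the same in SQL Server apart from the optional brackets).

```python
import sqlite3

# Sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, "John"), (2, "Jill"), (3, "John"),
     (4, "Phil"), (5, "Matt"), (6, "Jill")],
)

# One row per distinct Name, carrying the smallest ID seen for it.
unique_names = conn.execute("""
    SELECT MIN(ID) AS ID, Name
    FROM MyTable
    GROUP BY Name
    ORDER BY MIN(ID)
""").fetchall()

for row in unique_names:
    print(row)
```

Each name appears once, paired with the lowest of its IDs, which is enough to populate a drop-down list.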
You could also do it with the RANK() OVER function:
SELECT
Id,
Name
FROM
(
SELECT
Id,
[Name],
RANK() OVER (PARTITION BY [Name] Order By Id) As Idx
FROM Test
) A
WHERE Idx = 1
To learn more about the RANK() OVER function, read this:
http://msdn.microsoft.com/en-us/library/ms176102.aspx