how to compare two rows in one mdb table? - sql

I have one mdb table with the following structure:
Field1 Field2 Field3 Field4
A ...
B ...
I try to use a query to list all the different fields of row A and B in a result-set:
SELECT * From Table1
WHERE Field1 = 'A'
UNION
SELECT * From Table1
WHERE Field1 = 'B';
However this query has two problems:
it list all the fields including the
identical cells, with a large table
it gives out an error message: too
many fields defined.
How could i get around these issues?

Is it not easiest to just select all fields needed from the table, based on the Field1 value and group on the values needed?
So something like this:
SELECT field1, field2,...field195
FROM Table1
WHERE field1 = 'A' or field1 = 'B'
GROUP BY field1, field2, ....field195
This will give you all rows where field1 is A or B and there is a difference in one of the selected fields.
Oh and for the group by statement as well as the SELECT part, indeed use the previously mentioned edit mode for the query. There you can add all fields (by selecting them in the table and dragging them down) that are needed in the result, then click the 'totals' button in the ribbon to add the group by- statements for all. Then you only have to add the Where-clause and you are done.
Now that the question is more clear (you want the query to select fields instead of records based on the particular requirements), I'll have to change my answer to:
This is not possible.
(untill proven otherwise) ;)
As far as I know, a query is used to select records using for example the where clause, never used to determine which fields should be shown depending on a certain criterium.
One thing that MIGHT help in this case is to look at the database design. Are those tables correctly made?
Suppose you have 190 of those fields that are merely details of the main data. You could separate this in another table, so you have a main table and details table.
The details table could look something like:
ID ID_Main Det_desc Det_value
This way you can filter all Detail values that are equal between the two main values A and B using something like:
Select a.det_desc, a.det_value, b.det_value
(Select Det_desc, det_value
from tblDetails
where id_main = a) as A inner join
(Select Det_desc, det_value
from tblDetails
where id_main = a) as B
on A.det_desc = B.det_desc and A.det_value <> B.det_value
This you can join with your main table again if needed.

You can full join the table on itself, matching identical rows. Then you can filter on mismatches if one of the two join parts is null. For example:
select *
from (
select *
from Table1
where Field1 = 'A'
) A
full join
(
select *
from Table1
where Field1 = 'B'
) B
on A.Field2 = B.Field2
and A.Field3 = B.Field3
where A.Field1 is null
or B.Field1 is null
If you have 200 fields, ask Access to generate the column list by creating a query in design view. Switch to SQL view and copy/paste. An editor with column mode (like UltraEdit) will help create the query.

Related

I need to create a VIEW from an existing TABLE and MAP an additional COLUMN to that VIEW

I am fairly new to SQL. What I am trying to do is create a view from an existing table. I also need to add a new column to the view which maps to the values of an existing column in the table.
So within the view, if the value in a field for Col_1 = A, then the value in the corresponding row for New_Col = C etc
Does this even make sense? Would I use the CASE clause? Is mapping in this way even possible?
Thanks
The best way to do this is to create a mapping or lookup table
For example consider the following LOOKUP table.
COL_A NEW_VALUE
---- -----
A C
B D
Then you can have a query like this:
SELECT A.*, LOOK.NEW_VALUE
FROM TABLEA AS A
JOIN LOOKUP AS LOOK ON A.COL_A = LOOK.COL_A
This is what DimaSUN is doing in his query too -- but in his case he is creating the table dynamically in the body of the query.
Also note, I'm using a JOIN (which is an inner join) so only results in the lookup table will be returned. This could filter the results. A LEFT JOIN there would return all data from A but some of the new columns might be null.
Generally, a view is an instance of a table/a replica provided that there is no alteration to the original table. So, as per your query you can manipulate the data and columns in a view by using case.
Create View viewname as
Select *,
case when column=a.value then 'C'
....
ELSE
END
FROM ( Select * from table) a
If You have restricted list of replaced values You may hardcode that list in query
select T.*,map.New_Col
from ExistingTable T
left join (
values
('A','C')
,('B','D')
) map (Col_1,New_Col) on map.Col_1 = T.Col_1
In this sample You hardcode 'A' -> 'C' and 'B' -> 'D'
In general case You better may to use additional table ( see Hogan answer )

SQL - Find duplicate fields and count how many fields are matched

I have a a large customer database where customers have been added multiple times in some circumstances which is causing problems. I am able to use a query to identify the records which are an exact match, although some records have slight variations such as different addresses or given names.
I want to query across 10 fields, some records will match all 10 which is clearly a duplicate although other fields may only match 5 fields with another record and require further investigation. Therefore i want to create a results set which has field with a count how many fields have been matched. Basically to create a rating of the likely hood the result is an actual match. All 10 would be a clear dup but 5 would only be a possible duplicate.
Some will only match on POSTCODE and FIRSTNAME which is generally can be discounted.
Something like this helps but as it only returns records which explicitly match on all 3 records its not really useful due the sheer amount of data.
SELECT field1,field2,field3, count(*)
FROM table_name
GROUP BY field1,field2,field3
HAVING count(*) > 1
You are just missing the magic of CUBE(), which generates all the combinations of columns automatically
DECLARE #duplicate_column_threshold int = 5;
WITH cte AS (
SELECT
field1,field2,...,field10
,duplicate_column_count = (SELECT COUNT(col) FROM (VALUES (field1),(field2),...,(field10)) c(col))
FROM table_name
GROUP BY CUBE(field1,field2,...,field10)
HAVING COUNT(*) > 1
)
SELECT *
INTO #duplicated_rows
FROM cte
WHERE duplicate_column_count >= #duplicate_column_threshold
Update: to fetch the rows from the original table, join it against the #duplicated_rows using a technique that treats NULLs as wildcards when comparing the columns.
SELECT
a.*
,b.duplicate_column_count
FROM table_name a
INNER JOIN #duplicated_rows b
ON NULLIF(b.field1,a.field1) IS NULL
AND NULLIF(b.field2,a.field2) IS NULL
...
AND NULLIF(b.field10,a.field10) IS NULL
You might try something like
Select field1, field2, field3, ... , field10, count(1)
from customerdatabase
group by field1, field2, field3, ... , field10
order by field1, field2, field3, ... , field10
Where field1 through field10 are ordered by the "most identifiable/important" to least.
This is as close I've got to what i'm trying to achieve, which will return all records which have any duplicate fields. I want to add a column to the results which indicate how many fields have matched any other record in the table. There are around 40,000 records in total.
select * from [CUST].[dbo].[REPORTA] as a
where exists
(select [GIVEN.NAMES],[FAMILY.NAME],[DATE.OF.BIRTH],[POST.CODE],[STREET],[TOWN.COUNTRY]
from [CUST].[dbo].[REPORTA] as b
where a.[GIVEN.NAMES] = b.[GIVEN.NAMES]
or a.[FAMILY.NAME] = b.[FAMILY.NAME]
or a.[DATE.OF.BIRTH] = b.[DATE.OF.BIRTH]
or a.[POST.CODE] = b.[POST.CODE]
or a.[STREET] = b.[STREET]
or a.[TOWN.COUNTRY] = b.[TOWN.COUNTRY]
group by [GIVEN.NAMES],[FAMILY.NAME],[DATE.OF.BIRTH],[POST.CODE],[STREET],[TOWN.COUNTRY]
having count(*) >= 1)
This query will return thousands of records but I'm generally interested in the record with a high count of exactly matching fields

Writing a single UPDATE statement that prevents duplicates

I've been trying for a few hours (probably more than I needed to) to figure out the best way to write an update sql query that will dissallow duplicates on the column I am updating.
Meaning, if TableA.ColA already has a name 'TEST1', then when I'm changing another record, then I simply can't pick a value for ColA to be 'TEST1'.
It's pretty easy to simply just separate the query into a select, and use a server layer code that would allow conditional logic:
SELECT ID, NAME FROM TABLEA WHERE NAME = 'TEST1'
IF TableA.recordcount > 0 then
UPDATE SET NAME = 'TEST1' WHERE ID = 1234
END IF
But I'm more interested to see if these two queries can be combined into a single query.
I am using Oracle to figure things out, but I'd love to see a SQL Server query as well. I figured a MERGE statement can work, but for obvious reasons you can't have the clause:
..etc.. WHEN NOT MATCHED UPDATE SET ..etc.. WHERE ID = 1234
AND you can't update a column if it's mentioned in the join (oracle limitation but not limited to SQL Server)
ALSO, I know you can put a constraint on a column that prevents duplicate values, but I'd be interested to see if there is such a query that can do this without using constraint.
Here is an example start-up attempt on my end just to see what I can come up with (explanations on it failed is not necessary):
ERROR: ORA-01732: data manipulation operation not legal on this view
UPDATE (
SELECT d.NAME, ch.NAME FROM (
SELECT 'test1' AS NAME, '2722' AS ID
FROM DUAL
) d
LEFT JOIN TABLEA a
ON UPPER(a.name) = UPPER(d.name)
)
SET a.name = 'test2'
WHERE a.name is null and a.id = d.id
I have tried merge, but just gave up thinking it's not possible. I've also considered not exists (but I'd have to be careful since I might accidentally update every other record that doesn't match a criteria)
It should be straightforward:
update personnel
set personnel_number = 'xyz'
where person_id = 1001
and not exists (select * from personnel where personnel_number = 'xyz');
If I understand correctly, you want to conditionally update a field, assuming the value is not found. The following query does this. It should work in both SQL Server and Oracle:
update table1
set name = 'Test1'
where (select count(*) from table1 where name = 'Test1') > 0 and
id = 1234

Minus operator in sql

I am trying to create a sql query with minus.
I have query1 which returns 28 rows with 2 columns
I have query2 which returns 22 row2 with same 2 columns in query 2.
when I create a query query1 minus query 2 it should have only show the 28-22=6 rows.
But it showing up all the 28 rows returned by query1.
Please advise.
Try using EXCEPT instead of MINUS. For Example:
Lets consider a case where you want to find out what tasks are in a table that haven't been assigned to you(So basically you are trying to find what tasks could be available to do).
SELECT TaskID, TaskType
FROM Tasks
EXCEPT
SELECT TaskID, TaskType
FROM Tasks
WHERE Username = 'Vidya'
That would return all the tasks that haven't been assigned to you. Hope that helps.
If MINUS won't work for you, the general form you want is the main query in the outer select and a variation of the other query in a not exists clause.
select <insert list of fields here>
from mytable a
join myothertable b
on b.aId = a.aid
where not exists (select * from tablec c where a.aid = c.aid)
The fields might not be exactly alike. may be one of the fields is char(10) and the other is char(20) and they both have the string "TEST" in them. They might "look" the same.
If the database you are working on supports "INTERSECT", try this query and see how many are perfectly matching results.
select field1, field2 from table1
intersect
select field1, field2 from table2
To get the results you are expecting, this query should give you 22 rows.
something like this:
select field1, field2, . field_n
from tables
MINUS
select field1, field2, . field_n
from tables;
MINUS works on the same principle as it does in the set operations. Suppose if you have set A and B,
A = {1,2,3,4}; B = {3,5,6}
then, A-B = {1,2,4}
If A = {1,3,5} and B = {2,4,6}
then, A-B = {1,3,5}. Here the count(A) before and after the MINUS operation will be the same, as it does not contain any overlapping terms with set B.
On similar lines, may be the result set obtained in query 2 may not have matching terms with the result of query1. Hence you are still getting 28 instead of 6 rows.
Hope this helps.
It returns the difference records in the upper query which are not contained by the second query.
In your case for example
A={1,2,3,4,5...28} AND B={29,30} then A-B={1,2,3....28}

Optionally use a UNION from another table in T-SQL without using temporary tables or dynamic sql?

I have two sql server tables with the same structure. In a stored procedure I have a Select from the first table. Occasionally I want to select from the second table as well based on a passed in parameter.
I would like a way to do this without resorting to using dynamic sql or temporary tables.
Pass in param = 1 to union, anything else to only return the first result set:
select field1, field2, ... from table1 where cond
union
select field1, field2, ... from table2 where cond AND param = 1
If they are both the exact same structure, then why not have a single table with a parameter that differentiates the two tables? At that point, it becomes a simple matter of a case statement on the parameter on which results set you receive back.
A second alternative is dual result sets. You can select multiple result sets out of a stored procedure. Then in code, you would either use DataReader.NextResult or DataSet.Tables(1) to get at the second set of data. It will then be your code's responsibility to place them into the same collection or merge the two tables.
A THIRD possibility is to utilize an IF Statement. Say, pass in an integer with the expected possible values of 1,2, 3 and then have something along this in your actual stored procedure code
if #Param = 1 Then
Select From Table1
if #Param = 2 THEN
Select From Table2
if #Param = 3 Then
Select From Table1 Union Select From Table 2
A fourth possibility would be to have two distinct procedures one which runs a union and one which doesn't and then make it your code's responsibility to determine which one to call based on that parameter, something like:
myCommandObject.CommandText = IIf(myParamVariable = true, "StoredProc1", StoredProc2")
It's pretty easy.
/* Always return tableX */
select colA, colB
from tableX
union
select colA, colB
from tableY
where #parameter = 'IncludeTableY' /* Will union with an empty set otherwise */
If this isn't immediately apparent (it often isn't), consider the examples below. The primary thing to remember is that the if the where clause evaluates to true for a row, it is returned otherwise it's discarded.
This always evaluates to true so every row is returned.
select *
from tableX
where 1 = 1
This always evaluates to false so no rows are returned (sometimes used as a quick and dirty get-me-the-columns query).
select *
from tableX
where 1 = 0
this will return values from either table, depending on if you passed a value on the parameter
select field1, field2, ... from table1 where #p1 is null
union
select field1, field2, ... from table2 where #p1 is not null
you just need to add the rest of your criteria for the where clause
Use a view.
CREATE view_both
AS
SELECT *, 1 AS source
FROM table1
UNION ALL
SELECT *, 2 AS source
FROM table2
SELECT * FROM view_both WHERE source < #source_flag
The optimizer determines which, or both, tables to use based on source without requiring it to be indexed.