Compare 2 tables based on range values - sql

We have large transaction tables that contain all values, including duplicates. I need to eliminate the duplicates based on values in another table.
Table A (the transaction table) has Store, Date, Index, etc.
Table B maintains the index ranges; it has Store, Date, Index Begin, Index End, etc.
For each Store and Date, I need to compare the Index from Table A with the ranges in Table B and remove from Table A any rows whose Index falls within a range, so that I avoid duplicate values.
If a given Index is not between Index Begin and Index End, I can keep it. Index ranges start from 1, but I need to keep Index 1 because it is a header record.
The check has to start from Index 2 onwards. If you could help with an SQL statement, that would be great.
I tried a few statements, but they did not work.
I need to eliminate duplicate records from Table A based on the index ranges in Table B.

To eliminate duplicates, use the keyword DISTINCT after SELECT, i.e. SELECT DISTINCT. You'll also need to write a JOIN that compares the two tables on their common values (here, Store and Date).
I assume you already have a query, so I won't write one unless you comment that you need help :)
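Since the question is about index ranges rather than exact duplicate rows, a range check against Table B may be closer to what is needed than DISTINCT alone. The following is only a sketch, run on an in-memory SQLite database through Python so it can be tried directly; the table names (table_a, table_b) and column names (txn_date, idx, idx_begin, idx_end) are assumptions, not taken from the question.

```python
import sqlite3

# Hypothetical schema and sample data; all names here are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (store INTEGER, txn_date TEXT, idx INTEGER);
    CREATE TABLE table_b (store INTEGER, txn_date TEXT,
                          idx_begin INTEGER, idx_end INTEGER);
    INSERT INTO table_a VALUES
        (1, '2024-01-01', 1), (1, '2024-01-01', 2), (1, '2024-01-01', 3),
        (1, '2024-01-01', 7), (2, '2024-01-01', 2);
    INSERT INTO table_b VALUES (1, '2024-01-01', 1, 5);
""")

KEEP_ROWS = """
    SELECT a.store, a.txn_date, a.idx
    FROM table_a AS a
    WHERE a.idx = 1                 -- always keep the header record
       OR NOT EXISTS (              -- keep rows outside every matching range
            SELECT 1
            FROM table_b AS b
            WHERE b.store = a.store
              AND b.txn_date = a.txn_date
              AND a.idx BETWEEN b.idx_begin AND b.idx_end
          )
    ORDER BY a.store, a.txn_date, a.idx
"""
kept = conn.execute(KEEP_ROWS).fetchall()
print(kept)
```

Rows 2 and 3 of store 1 fall inside the 1 to 5 range and are dropped; the header row, the row with index 7, and the row for store 2 (which has no range in table_b) are kept. The same WHERE clause, negated, could drive a DELETE instead of a SELECT.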

Related

Nulls in one of the columns in a composite unique index

I have a unique index on (id, name) columns. I have a date column that I want to add to the index since I want the uniqueness to be based on (id, name, date) columns. The date column contains a lot of null values. How would it affect the index?
If you are using SQL Server, NULL values are stored in the index like any other value, and a unique index treats NULLs as equal to each other, so at most one row per (id, name) pair can have a NULL date. SQL Server 2008 and later also support filtered indexes: if a column has many NULL values, you can create a unique filtered index with a WHERE date IS NOT NULL condition so that rows with NULL dates are excluded from it.
For more information, see the SQL Server documentation on filtered indexes.
In summary: you can add the column to the index without problems; in many databases NULL values do not noticeably affect index performance.
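To make the filtered-index idea concrete, here is a small sketch. It runs on SQLite (which calls these partial indexes) purely so that it is executable; SQL Server accepts a similar CREATE UNIQUE INDEX ... WHERE ... statement, but its handling of NULLs in ordinary unique indexes differs from SQLite's, so treat this as an illustration of the filter only. The table name items and column name entry_date are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, name TEXT, entry_date TEXT);
    -- Uniqueness is enforced only for rows that actually have a date.
    CREATE UNIQUE INDEX ux_items_dated
        ON items (id, name, entry_date)
        WHERE entry_date IS NOT NULL;
""")

# Rows with a NULL date are outside the index, so repeats are accepted.
conn.execute("INSERT INTO items VALUES (1, 'a', NULL)")
conn.execute("INSERT INTO items VALUES (1, 'a', NULL)")

# Rows with a date are covered, so an exact repeat is rejected.
conn.execute("INSERT INTO items VALUES (1, 'a', '2024-01-01')")
try:
    conn.execute("INSERT INTO items VALUES (1, 'a', '2024-01-01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(duplicate_rejected, row_count)
```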

Remove duplicates from table which doesn't have any key column

I have two tables, TabA and TabB. Neither has any key columns.
Both have identical column structures, with more than 80 columns.
TabA has 30 million records. TabB has only 2000 records.
Since there is no key column, I need to compare all the columns between both tables and remove the duplicate records from TabA.
I would like to find the best approach to compare both tables without placing all 80 columns in JOIN or WHERE clauses.
If all 80 columns are needed to identify a record, you'll have a hard time not using all of them in your query, one way or another.
You could calculate a hash code using HASHBYTES() over all the columns, and then compare only the resulting hash codes.
There's also the CHECKSUM(*) function, which calculates a hash over all the columns without the need to list them explicitly, but it returns an int, which is too weak if occasional false positives are not acceptable.
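The hash-comparison idea can be sketched as follows. This is not SQL Server code: it uses SQLite through Python with a user-defined row_hash function (a hypothetical helper built on SHA-256) standing in for HASHBYTES(), and three columns standing in for the 80. It also assumes "duplicates" means rows of TabA that also appear in TabB.

```python
import hashlib
import sqlite3


def row_hash(*values):
    """Hash a whole row; repr() of the tuple keeps NULLs (None) and
    column boundaries unambiguous."""
    return hashlib.sha256(repr(values).encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# narg=-1 lets the SQL function accept any number of columns.
conn.create_function("row_hash", -1, row_hash)
conn.executescript("""
    CREATE TABLE TabA (c1 TEXT, c2 INTEGER, c3 TEXT);
    CREATE TABLE TabB (c1 TEXT, c2 INTEGER, c3 TEXT);
    INSERT INTO TabA VALUES ('x', 1, NULL), ('y', 2, 'p'), ('z', 3, 'q');
    INSERT INTO TabB VALUES ('y', 2, 'p'), ('w', 9, NULL);
""")

with conn:  # commit on success, roll back on error
    conn.execute("""
        DELETE FROM TabA
        WHERE row_hash(c1, c2, c3) IN (
            SELECT row_hash(c1, c2, c3) FROM TabB
        )
    """)

remaining = conn.execute(
    "SELECT c1, c2, c3 FROM TabA ORDER BY c1"
).fetchall()
print(remaining)
```

The columns still have to be listed once inside the hash call, but the comparison itself is on a single value. With only 2000 rows in TabB, the set of hashes on that side is small.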

How to update numerical column of one table based on matching string column from another table in SQL

I want to update the numerical columns of one table based on matching string columns from another table, i.e.:
I have a table (let's say table1) with 100 records containing 5 string (or text) columns and 10 numerical columns. I have another table (table2) with the same structure (columns) and 20 records. A few of its records contain updated data for table1, i.e. the numerical column values have changed for these records, and the rest are new (both text and numerical columns).
I want to update the numerical columns for records in table1 that have the same text columns, and insert the data from table2 into table1 where the text columns are new.
I thought of taking an intersect of these two tables and then updating, but I couldn't figure out the logic for updating the numerical columns.
Note: I don't have any primary or unique key columns.
Please help here.
Thanks in advance.
The simplest solution would be to use two separate queries, such as:
UPDATE b
SET b.[NumericColumn] = a.[NumericColumn]
    -- , etc...
FROM [dbo].[SourceTable] a
JOIN [dbo].[DestinationTable] b
    ON a.[StringColumn1] = b.[StringColumn1]
    AND a.[StringColumn2] = b.[StringColumn2]
    -- AND etc...
;

INSERT INTO [dbo].[DestinationTable] (
    [NumericColumn],
    [StringColumn1],
    [StringColumn2]
    -- , etc...
)
SELECT a.[NumericColumn],
    a.[StringColumn1],
    a.[StringColumn2]
    -- , etc...
FROM [dbo].[SourceTable] a
LEFT JOIN [dbo].[DestinationTable] b
    ON a.[StringColumn1] = b.[StringColumn1]
    AND a.[StringColumn2] = b.[StringColumn2]
    -- AND etc...
WHERE b.[NumericColumn] IS NULL;
-- Assumes that [NumericColumn] is non-nullable.
-- If there are no non-nullable columns, you
-- will have to structure your query differently.
This will be effective if you are working with a small dataset that does not change very frequently and you are not worried about high contention.
There are still a number of issues with this approach - most notably what happens if either the source or destination table is accessed and/or modified while the update statement is running. Some of these issues can be worked around other ways but so much depends on the context of how the tables are used that it is difficult to provide a more effective generically-applicable solution.
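As a runnable illustration of the update-then-insert pattern, here is a sketch on SQLite through Python. The table names (dest, src) and columns (s1, s2, num) are assumptions. It differs from the T-SQL above in two ways: it uses correlated subqueries instead of UPDATE ... FROM, and it detects new rows with NOT EXISTS, which avoids relying on a non-nullable column. Both statements run in one transaction, which narrows (but does not remove) the concurrency concerns mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dest (s1 TEXT, s2 TEXT, num REAL);
    CREATE TABLE src  (s1 TEXT, s2 TEXT, num REAL);
    INSERT INTO dest VALUES ('a', 'x', 1.0), ('b', 'y', 2.0);
    INSERT INTO src  VALUES ('a', 'x', 10.0), ('c', 'z', 3.0);
""")

with conn:  # one transaction: commit on success, roll back on error
    # Step 1: refresh numeric values where the text columns match.
    conn.execute("""
        UPDATE dest
        SET num = (SELECT s.num FROM src AS s
                   WHERE s.s1 = dest.s1 AND s.s2 = dest.s2)
        WHERE EXISTS (SELECT 1 FROM src AS s
                      WHERE s.s1 = dest.s1 AND s.s2 = dest.s2)
    """)
    # Step 2: insert source rows whose text columns are not present yet.
    conn.execute("""
        INSERT INTO dest (s1, s2, num)
        SELECT s.s1, s.s2, s.num
        FROM src AS s
        WHERE NOT EXISTS (SELECT 1 FROM dest AS d
                          WHERE d.s1 = s.s1 AND d.s2 = s.s2)
    """)

rows = conn.execute("SELECT s1, s2, num FROM dest ORDER BY s1").fetchall()
print(rows)
```

Because there is no unique key, this sketch assumes each combination of text columns occurs at most once in the source table; otherwise the scalar subquery in step 1 would pick an arbitrary matching row.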

Use trigger to calculate difference between rows in SQLite

Given a table structure like this:
ID|Measurement|Diff|Date
where ID and Date form the composite primary key, and rows are further indexed by the Date column.
I want to use a trigger (after an insert or replace into) to calculate the Diff column for the table. The Diff column simply records the differences in the values of Measurement between two adjacent dates for the same ID.
What is the optimal way of doing this in SQLite? Performance is crucial here, since the table is large, i.e. 1M+ rows.
The query to calculate the value should be something like this:
update structure
set diff = new.measurement - (select s.measurement
                              from structure s
                              where s.id = new.id
                                and s.date < new.date
                              order by s.date desc
                              limit 1)
where id = new.id
  and date = new.date;
The update should use the primary key index to quickly identify the row. The subquery can use the same (ID, Date) index to quickly find the previous row for that ID. So, this should have reasonable performance.
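To see the idea run end to end, here is a minimal sketch that wraps such an update in an AFTER INSERT trigger, using Python's sqlite3 module. The trigger name structure_diff and the sample data are assumptions. The subquery is restricted to the same ID, and the outer WHERE clause targets only the newly inserted row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE structure (
        id          INTEGER,
        measurement REAL,
        diff        REAL,
        date        TEXT,
        PRIMARY KEY (id, date)
    );

    -- Hypothetical trigger name; fires once per inserted row.
    CREATE TRIGGER structure_diff AFTER INSERT ON structure
    BEGIN
        UPDATE structure
        SET diff = NEW.measurement - (
                SELECT s.measurement
                FROM structure AS s
                WHERE s.id = NEW.id AND s.date < NEW.date
                ORDER BY s.date DESC
                LIMIT 1)
        WHERE id = NEW.id AND date = NEW.date;
    END;
""")

with conn:
    conn.execute("INSERT INTO structure VALUES (1, 10.0, NULL, '2024-01-01')")
    conn.execute("INSERT INTO structure VALUES (1, 13.5, NULL, '2024-01-02')")
    conn.execute("INSERT INTO structure VALUES (2, 5.0, NULL, '2024-01-02')")

rows = conn.execute(
    "SELECT id, date, measurement, diff FROM structure ORDER BY id, date"
).fetchall()
print(rows)
```

The first row for each ID has no predecessor, so its diff stays NULL. Note that this sketch only fills in the diff of the inserted row; if rows can arrive out of date order, the diff of the following row would need to be recomputed as well.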

Returning an Access recordset with zeros instead of nulls

Here's the problem:
I have an Access query that feeds a report, which sometimes doesn't return any records for certain criteria. I would like to display zeros in the report instead of an empty line (an empty recordset is currently being returned).
Is there an SQL solution that (perhaps using some kind of union statement and/or nested SQL) always returns one record (with zeros) if there are not matching records from the initial query?
One possible solution would be to create a second table with the same primary key, and add just one record. In your query, choose as join type all records in the second table, including those with no matching records in the first one. Select as output all fields in the first table.
You can materialize a one-row table with zero for all columns. This is a slight pain to achieve in Access (ACE, Jet, whatever) because it doesn't support row constructors and the FROM must resolve to a base table. In other words, you'll need a table that is guaranteed to always contain at least one row.
This isn't a problem for me because my databases always include auxiliary tables, e.g. a calendar table, a sequence table of integers, etc. For example, to materialize a one-row, all-zeros table using my 3000-row Calendar table:
SELECT DISTINCT 0 AS c
  FROM Calendar;
I can then UNION my query with my materialized table but include an antijoin to ensure the all-zeros row only appears in the resultset when my query is the empty set:
SELECT c
  FROM T
UNION
SELECT 0
  FROM Calendar
 WHERE NOT EXISTS (
                   SELECT c
                     FROM T
                  );
Note the use of UNION allows me to remove the DISTINCT keyword and the AS clause ("column alias") from the materialized table.
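The behaviour of this UNION plus antijoin can be checked with a small sketch. It runs on SQLite through Python rather than Access, so it only illustrates the logic; T and Calendar follow the names used above, while the column dt and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (c INTEGER);
    CREATE TABLE Calendar (dt TEXT);  -- any table guaranteed to have rows
    INSERT INTO Calendar VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
""")

QUERY = """
    SELECT c
      FROM T
    UNION
    SELECT 0
      FROM Calendar
     WHERE NOT EXISTS (SELECT c FROM T)
"""

# T is empty: only the all-zeros row comes back (UNION collapses the
# three identical zeros produced from Calendar into one).
empty_case = conn.execute(QUERY).fetchall()

# T has rows: the antijoin suppresses the zeros row entirely.
conn.execute("INSERT INTO T VALUES (5), (7)")
filled_case = sorted(conn.execute(QUERY).fetchall())

print(empty_case, filled_case)
```

One side effect worth keeping in mind is that UNION also removes duplicate rows coming from T itself, which may or may not be wanted in a report query.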