I am trying to tidy up a database removing duplicates.
Let's say I have a database that looks like this one
Date | Value1 | Value2
01/01/2018 A B
01/01/2018 B A
02/01/2018 A B
In this case, according to my needs, the first two rows are identical, so I would like to drop one (let's say the second one). If I use SELECT DISTINCT it won't drop it because they are, in fact, different.
Is there an easy procedure to do so?
Thank you
You could use the exists operator to find (and delete!) them:
DELETE FROM mytable a
WHERE EXISTS (SELECT *
FROM mytable b
WHERE a.value1 = b.value2 AND
a.value2 = b.value1 AND
a.value1 < b.value1)
Here's one possibility:
create table t(date,value1,value2);
insert into t values
('01/01/2018','A','B'),
('01/01/2018','B','A'),
('02/01/2018','A','B');
select distinct date,min(value1,value2) value1,max(value1,value2) value2 from t;
Related
I am new to sql and are trying to combine a column value from three different tables and combine to one row in DB2 Warehouse on Cloud. Each table consists of only one row and unique column name. So what I want to is just join these three to one row their original column names.
Each table is built from a statement that looks like this:
SELECT SUM(FUEL_TEMP.FUEL_MLAD_VALUE) AS FUEL
FROM
(SELECT ML_ANOMALY_DETECTION.MLAD_METRIC AS MLAD_METRIC, ML_ANOMALY_DETECTION.MLAD_VALUE AS FUEL_MLAD_VALUE, ML_ANOMALY_DETECTION.TAG_NAME AS TAG_NAME, ML_ANOMALY_DETECTION.DATETIME AS DATETIME, DATA_CONFIG.SYSTEM_NAME AS SYSTEM_NAME
FROM ML_ANOMALY_DETECTION
INNER JOIN DATA_CONFIG ON
(ML_ANOMALY_DETECTION.TAG_NAME =DATA_CONFIG.TAG_NAME AND
DATA_CONFIG.SYSTEM_NAME = 'FUEL')
WHERE ML_ANOMALY_DETECTION.MLAD_METRIC = 'IFOREST_SCORE'
AND ML_ANOMALY_DETECTION.DATETIME >= (CURRENT DATE - 9 DAYS)
ORDER BY DATETIME DESC)
AS FUEL_TEMP
I have tried JOIN, INNER JOIN, UNION/UNION ALL, but can't get it to work as it should. How can I do this?
Use a cross-join like this:
create table table1 (field1 char(10));
create table table2 (field2 char(10));
create table table3 (field3 char(10));
insert into table1 values('value1');
insert into table2 values('value2');
insert into table3 values('value3');
select *
from table1
cross join table2
cross join table3;
Result:
field1 field2 field3
---------- ---------- ----------
value1 value2 value3
A cross join joins all the rows on the left with all the rows on the right. You will end up with a product of rows (table1 rows x table2 rows x table3 rows). Since each table only has one row, you will get (1 x 1 x 1) = 1 row.
Using UNION should solve your problem. Something like this:
SELECT
WarehouseDB1.WarehouseID AS TheID,
'A' AS TheSystem,
WarehouseDB1.TheValue AS TheValue
FROM WarehouseDB1
UNION
SELECT
WarehouseDB2.WarehouseID AS TheID,
'B' AS TheSystem,
WarehouseDB2.TheValue AS TheValue
FROM WarehouseDB2
UNION
WarehouseDB3.WarehouseID AS TheID,
'C' AS TheSystem,
WarehouseDB3.TheValue AS TheValue
FROM WarehouseDB3
Ill adapt the code with your table names and rows if you tell me what they are. This kind of query would return something like the following:
TheID TheSystem TheValue
1 A 10
2 A 20
3 B 30
4 C 40
5 C 50
As long as your column names match in each query, you should get the desired results.
I have two tables table which are identical in structure but belong to different schemas (schemas A and B). All rows in question will always appear in the A.table but may or may not appear in B.table. B.table is essentially an override for the defaults in A.table.
As such my query uses a COALESCE on each field similar to:
SELECT COALESCE(B.id, A.id) as id,
COALESCE(B.foo, A.foo) as foo,
COALESCE(B.bar, A.bar) as bar
FROM A.table LEFT JOIN B.table ON (A.id = B.id)
WHERE A.id in (1, 2, 3)
This works great, but I also want to add the source of the data. In the example above, assuming id=2 existed in B.table but not 1 or 3, I would want to include some indication that A is the source for 1 and 3 and B is the source for 2.
So the data might look like the following
+---------------------------------+
| id | foo | bar | source |
+---------------------------------+
| 1 | a | b | A |
| 2 | c | d | B |
| 3 | e | f | A |
+---------------------------------+
I don't really care what the value of source is as long as I can distinguish A from B.
I am no pgsql expert (not by a long shot) but I have tinkered around with EXISTS and a subquery but have had no luck so far.
As records showing the default value (from A.table) have NULLs for B.id, all you need is to add this column specification to your query:
CASE WHEN B.id IS NULL THEN 'A' ELSE 'B' END AS Source
The USING clause would simplify the query you have:
SELECT id
, COALESCE(B.foo, A.foo) AS foo
, COALESCE(B.bar, A.bar) AS bar
, CASE WHEN b.id IS NULL THEN 'A' ELSE 'B' END AS source -- like #Terje provided
FROM a
LEFT JOIN b USING (id)
WHERE a.id IN (1, 2, 3);
But typically, this alternative query should serve you better:
SELECT x.* -- or list columns of your choice
FROM (VALUES (1), (2), (3)) t (id)
, LATERAL (
SELECT *, 'B' AS source FROM b WHERE id = t.id
UNION ALL
SELECT *, 'A' FROM a WHERE id = t.id
LIMIT 1
) x
ORDER BY x.id;
Advantages:
You don't have to add another COALESCE construct for every column you want to add to the result.
The same query works for any number of columns in a and b.
The query even works if the column names are not identical. Only number and data types of columns must match.
Of course, you can always list selected, compatible columns as well:
SELECT * -- or list columns of your choice
FROM (VALUES (1), (2), (3)) t (id)
, LATERAL (
SELECT foo, bar, 'B' AS source FROM b WHERE id = t.id
UNION ALL
SELECT foo2, bar17, 'A' FROM a WHERE id = t.id
LIMIT 1
) x
ORDER BY x.id;
The first SELECT determines names, data types and number of columns.
This query doesn't break if columns in b are not defined NOT NULL.
COALESCE cannot tell the difference between b.foo IS NULL and no row with matching id in b. So the source of any result column (except id) can still be 'A', even if the result row says 'B' - if any relevant column in b can be NULL.
My alternative returns all values from b if the row exists - including NULL values. So the result can be different if columns in b can be NULL. It depends on your requirements which behavior is desirable.
Either query assumes that id is defined as primary key (so exactly 1 or 0 rows per given id value).
Related:
Select first record if none match
What is the difference between LATERAL and a subquery in PostgreSQL?
I have a DB2 table with two columns A and B storing alphanumeric values.I want to find whether a value(MyValue) exist in between A and B. I want my result to be : MyValue | A | B
I could have used: SELECT 'MyValue',A,B FROM TABLENAME WHERE(A < 'MyValue' AND B > 'MyValue') but the constraint is, I have to search more than 10000 values with single query and want the result as below format.
Ex. value=W140686,0032090,0045790...etc
Expected Result:
MyValue A B
W140686 | W000000 | W999999
0032090 | 0000001 | 0500000
0045790 | 0000001 | 0500000
..
..
..
Any help / suggestion would be highly appreciated.
Thanks in advance.
The BETWEEN predicate includes the end points. In this case, the result set will include the string 'MyValue' if it's equal to either the string in column A or the string in column B.
SELECT 'MyValue',A,B
FROM TABLENAME
WHERE 'MyValue' BETWEEN A AND B;
If you want to exclude the end points . . .
SELECT 'MyValue',A,B
FROM TABLENAME
WHERE 'MyValue' > A AND 'MyValue' < B;
For performance, you might want one index on the pair of columns {A, B}. For details, read the execution plan.
If you need to supply 10000 values to the WHERE clause, you should probably store them in a temporary table. Join the temporary table to TABLENAME.
-- DB2 syntax to create a temporary table is different. Look it up.
create temp table my_values (
MyValue varchar(10) primary key
);
-- Insert the 10000 values here.
insert into my_values values
('W140686'),('0032090'), ('0045790');
select T1.MyValue, T2.a, T2.b
from my_values T1
inner join TABLENAME T2
on T1.MyValue > T2.a and T1.MyValue < T2.b
The join predicate is the same as the WHERE clause in my earlier queries.
If I have to search for some data I can use wildcards and use a simple query -
SELECT * FROM TABLE WHERE COL1 LIKE '%test_string%'
And, if I have to look through many values I can use -
SELECT * FROM TABLE WHERE COL1 IN (Select col from AnotherTable)
But, is it possible to use both together. That is, the query doesn't just perform a WHERE IN but also perform something similar to WHERE LIKE? A query that just doesn't look through a set of values but search using wildcards through a set of values.
If this isn't clear I can give an example. Let me know. Thanks.
Example -
lets consider -
AnotherTable -
id | Col
------|------
1 | one
2 | two
3 | three
Table -
Col | Col1
------|------
aa | one
bb | two
cc | three
dd | four
ee | one_two
bb | three_two
Now, if I can use
SELECT * FROM TABLE WHERE COL1 IN (Select col from AnotherTable)
This gives me -
Col | Col1
------|------
aa | one
bb | two
cc | three
But what if I need -
Col | Col1
------|------
aa | one
bb | two
cc | three
ee | one_two
bb | three_two
I guess this should help you understand what I mean by using WHERE IN and LIKE together
SELECT *
FROM TABLE A
INNER JOIN AnotherTable B on
A.COL1 = B.col
WHERE COL1 LIKE '%test_string%'
Based on the example code provided, give this a try. The final select statement presents the data as you have requested.
create table #AnotherTable
(
ID int IDENTITY(1,1) not null primary key,
Col varchar(100)
);
INSERT INTO #AnotherTable(col) values('one')
INSERT INTO #AnotherTable(col) values('two')
INSERT INTO #AnotherTable(col) values('three')
create table #Table
(
Col varchar(100),
Col1 varchar(100)
);
INSERT INTO #Table(Col,Col1) values('aa','one')
INSERT INTO #Table(Col,Col1) values('bb','two')
INSERT INTO #Table(Col,Col1) values('cc','three')
INSERT INTO #Table(Col,Col1) values('dd','four')
INSERT INTO #Table(Col,Col1) values('ee','one_two')
INSERT INTO #Table(Col,Col1) values('ff','three_two')
SELECT * FROM #AnotherTable
SELECT * FROM #Table
SELECT * FROM #Table WHERE COL1 IN(Select col from #AnotherTable)
SELECT distinct A.*
FROM #Table A
INNER JOIN #AnotherTable B on
A.col1 LIKE '%'+B.Col+'%'
DROP TABLE #Table
DROP TABLE #AnotherTable
Yes. Use the keyword AND:
SELECT * FROM TABLE WHERE COL1 IN (Select col from AnotherTable) AND COL1 LIKE '%test_string%'
But in this case, you are probably better off using JOIN syntax:
SELECT TABLE.* FROM TABLE JOIN AnotherTable on TABLE.COL1 = AnotherTable.col WHERE TABLE.COL1 LIKE '%test_string'
no because each element in the LIKE clause needs the wildcard and there's not a way to do that with the IN clause
The pattern matching operators are:
IN, against a list of values,
LIKE, against a pattern,
REGEXP/RLIKE against a regular expression (which includes both wildcards and alternatives, and is thus closest to "using wildcards through a set of valuws", e.g. (ab)+a|(ba)+b will match all strings aba...ba or bab...ab),
FIND_IN_SET to get the index of a string in a set (which is represented as a comma separated string),
SOUNDS LIKE to compare strings based on how they're pronounced and
MATCH ... AGAINST for full-text matching.
That's about it for string matching, though there are other string functions.
For the example, you could try joining on Table.Col1 LIKE CONCAT(AnotherTable.Col, '%'), though performance will probably be dreadful (assuming it works).
Try a cross join, so that you can compare every row in AnotherTable to every row in Table:
SELECT DISTINCT t.Col, t.Col1
FROM AnotherTable at
CROSS JOIN Table t
WHERE t.col1 LIKE ('%' + at.col + '%')
To make it safe, you'll need to escape wildcards in at.col. Try this answer for that.
If I understand the question correctly you want the rows from "Table" when "Table.Col1" is IN "AnotherTable.Col" and you also want the rows when Col1 IS LIKE '%some_string%'.
If so you want something like:
SELECT
t.*
FROM
[Table] t
LEFT JOIN
[AnotherTable] at ON t.Col1 = at.Col
WHERE (at.Col IS NOT NULL
OR t.Col1 LIKE '%some_string%')
Something like this?
SELECT * FROM TABLE
WHERE
COL1 IN (Select col from AnotherTable)
AND COL1 LIKE '%test_string%'
Are you thinking about something like EXISTS?
SELECT * FROM TABLE t WHERE EXISTS (Select col from AnotherTable t2 where t2.col = t.col like '%test_string%' )
i need Add Row Numbers To a SELECT Query without using Row_Number() function.
and without using user defined functions or stored procedures.
Select (obtain the row number) as [Row], field1, field2, fieldn from aTable
UPDATE
i am using SAP B1 DIAPI, to make a query , this system does not allow the use of rownumber() function in the select statement.
Bye.
I'm not sure if this will work for your particular situation or not, but can you execute this query with a stored procedure? If so, you can:
A) Create a temp table with all your normal result columns, plus a Row column as an auto-incremented identity.
B) Select-Insert your original query, sans the row column (SQL will fill this in automatically for you)
C) Select * on the temp table for your result set.
Not the most elegant solution, but will accomplish the row numbering you are wanting.
This query will give you the row_number,
SELECT
(SELECT COUNT(*) FROM #table t2 WHERE t2.field <= t1.field) AS row_number,
field,
otherField
FROM #table t1
but there are some restrictions when you want to use it. You have to have one column in your table (in the example it is field) which is unique and numeric and you can use it as a reference. For example:
DECLARE #table TABLE
(
field INT,
otherField VARCHAR(10)
)
INSERT INTO #table(field,otherField) VALUES (1,'a')
INSERT INTO #table(field,otherField) VALUES (4,'b')
INSERT INTO #table(field,otherField) VALUES (6,'c')
INSERT INTO #table(field,otherField) VALUES (7,'d')
SELECT * FROM #table
returns
field | otherField
------------------
1 | a
4 | b
6 | c
7 | d
and
SELECT
(SELECT COUNT(*) FROM #table t2 WHERE t2.field <= t1.field) AS row_number,
field,
otherField
FROM #table t1
returns
row_number | field | otherField
-------------------------------
1 | 1 | a
2 | 4 | b
3 | 6 | c
4 | 7 | d
This is the solution without functions and stored procedures, but as I said there are the restrictions. But anyway, maybe it is enough for you.
RRUZ, you might be able to hide the use of a function by wrapping your query in a View. It would be transparent to the caller. I don't see any other options, besides the ones already mentioned.