SQL Server: Nested Select Query

I have a SQL query returning results based on a where clause.
I would like to include some more results, from the same table, dependent on what is found in the first select.
My select returns rows whose IDs meet the where criteria. The table can contain more rows with the same ID that do not meet the initial where criteria. Rather than re-querying the DB with a separate call, I would like to use one select statement to also get these extra rows with the same ID. (ID here is not the table's index/primary key; it's just a naming convention I am using.)
Pseudo: (two steps)
1: select * from table where condition=xxx
2: for each row returned, (select * from table where id=row.id)
I want to do:
select
id as thisID, field1, field2,
(select id, field1, field2 from table where id = thisID)
from
table
where
condition=xxx
I have multiple joins in my real query and just can't get the above to work. Unfortunately I cannot supply the real query, but I get this error:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS. Invalid column name 'thisID'
My query works fine with the multiple joins, without the above. I am trying to retrieve these extra records as part of the current working query.
Example:
TABLE
select * from table where col3 = 'green'
id, col1, col2, col3
123 | blue | red | green
-------------------------
567 | blue | red | green
-------------------------
123 | blue | red | blue
-------------------------
890 | blue | red | green
-------------------------
I want to return all 4 rows, because although row 3 fails the where condition, it has the same id value as row 1 (123), and I need to include it, as it is part of a "set" that I need to locate / import, called / referenced by id=123.
What I am doing manually now, is getting row one, and then running another query based on row 1's ID, to get row 3 as well.

You can use Where IN
select id as thisID, field1, field2 from table
where id in
(select id from table where condition=xxx)
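To see the WHERE ... IN approach on the example data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server; the table name items is an assumption made for the demo, and col3 = 'green' plays the role of condition=xxx.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; "items" is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (123, "blue", "red", "green"),
        (567, "blue", "red", "green"),
        (123, "blue", "red", "blue"),
        (890, "blue", "red", "green"),
    ],
)

# The inner query finds the ids that satisfy the condition;
# the outer query returns every row that shares one of those ids.
rows = conn.execute(
    """
    SELECT id, col1, col2, col3
    FROM items
    WHERE id IN (SELECT id FROM items WHERE col3 = 'green')
    ORDER BY id, col3
    """
).fetchall()

for row in rows:
    print(row)
```

All four rows come back, including the (123, blue, red, blue) row that fails the condition on its own but shares an id with a matching row.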

Try this.
Let's say your table is as below and is called #Temp:
Id Col1 Col2 Col3
123 blue red green
567 blue red green
123 blue red blue
890 blue red green
First, collect the matching ids in a temp table:
Create Table #T1(Id int)
Insert Into #T1
Select Id
From #Temp
Where Col3='green'
Then
Select distinct *
From #Temp
Where Id in (select Id from #T1) Or Col3='Green'
This returns all the rows from the main table.
Update
If you want to keep the approach you are currently using, try something like the below:
select
id as thisID, field1, field2,
(select top 1 id from table where id = t.id) as Id,
(select top 1 field1 from table where id = t.id) as field1,
(select top 1 field2 from table where id = t.id) as field2
from
table t
where
condition=xxx

Related

How to delete rows where more than 1 column matches another table?

I have two tables. One (let's call it table1) looks a bit like this:
account_number | offer_code
---------------|-----------
1 | 123
1 | 456
2 | 123
The other table (let's call it table2) looks a bit like this:
account_number | offer_code
---------------|-----------
1 | 123
I want to delete all rows from table1 where the account_number AND the offer_code match a row in table2. So afterwards table1 would look like this:
account_number | offer_code
---------------|-----------
1 | 456
2 | 123
I've tried the following, but it doesn't run:
DELETE
FROM TABLE1 A
INNER JOIN
TABLE2 B
ON A.ACCOUNT_NUMBER = B.ACCOUNT_NUMBER
AND A.OFFER_CODE = B.OFFER_CODE
;
I've also tried the following. It seems to run, but the sheer volume of data in both tables (65.5m rows in table1 and 9m in table2) means it takes an impractically long time (I was forced to kill the query after 3 hours).
DELETE
FROM TABLE1
WHERE CONCAT(ACCOUNT_NUMBER, OFFER_CODE) IN
(
SELECT CONCAT(ACCOUNT_NUMBER, OFFER_CODE)
FROM TABLE2
)
;
Does anyone know if there is a way to accomplish this efficiently please?
Large deletes and updates are expensive for a database. Depending on your application (check this carefully!), it can be cheaper to build a new table holding only the rows you want to keep and then swap it in:
create table table1_tmp as
select * from table1
minus
select * from table2;
alter table table1 rename to table1_tmp2;
alter table table1_tmp rename to table1;
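The rebuild-and-swap idea can be tried on the sample data with the sketch below. It assumes an in-memory SQLite database via Python's sqlite3 module, and it uses EXCEPT, the standard-SQL spelling of Oracle's MINUS. Two caveats apply: EXCEPT also collapses duplicate rows within table1, and a table created with CREATE TABLE ... AS does not inherit indexes or constraints from the original.

```python
import sqlite3

# In-memory SQLite is an assumption for the demo; the answer targets a MINUS-capable database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (account_number INTEGER, offer_code INTEGER);
    CREATE TABLE table2 (account_number INTEGER, offer_code INTEGER);
    INSERT INTO table1 VALUES (1, 123), (1, 456), (2, 123);
    INSERT INTO table2 VALUES (1, 123);

    -- Build the surviving rows in a new table instead of deleting in place.
    CREATE TABLE table1_tmp AS
        SELECT account_number, offer_code FROM table1
        EXCEPT
        SELECT account_number, offer_code FROM table2;

    -- Swap the tables; the original data stays available as table1_tmp2.
    ALTER TABLE table1 RENAME TO table1_tmp2;
    ALTER TABLE table1_tmp RENAME TO table1;
    """
)

remaining = conn.execute(
    "SELECT account_number, offer_code FROM table1 ORDER BY account_number, offer_code"
).fetchall()
print(remaining)
```

Comparing the two columns as a pair, rather than concatenating them, also avoids false matches such as (1, 23) versus (12, 3), which concatenate to the same string.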

Remove duplicate rows based on specific columns

I have a table that contains these columns:
ID (varchar)
SETUP_ID (varchar)
MENU (varchar)
LABEL (varchar)
The thing I want to achieve is to remove all duplicates from the table based on two columns (SETUP_ID, MENU).
Table I have:
id | setup_id | menu | label |
-------------------------------------
1 | 10 | main | txt |
2 | 10 | main | txt |
3 | 11 | second | txt |
4 | 11 | second | txt |
5 | 12 | third | txt |
Table I want:
id | setup_id | menu | label |
-------------------------------------
1 | 10 | main | txt |
3 | 11 | second | txt |
5 | 12 | third | txt |
You can achieve this with a common table expression (CTE):
with cte as (
select id, setup_id, menu,
row_number () over (partition by setup_id, menu order by id) rownum
from atable )
delete from atable a
where id in (select id from cte where rownum >= 2)
This will give you your desired output.
Common Table Expression docs
Assuming a table named tbl where both setup_id and menu are defined NOT NULL and id is the PRIMARY KEY.
EXISTS will do nicely:
DELETE FROM tbl t0
WHERE EXISTS (
SELECT FROM tbl t1
WHERE t1.setup_id = t0.setup_id
AND t1.menu = t0.menu
AND t1.id < t0.id
);
This deletes every row where a dupe with lower id is found, effectively only keeping the row with the smallest id from each set of dupes. An index on (setup_id, menu) or even (setup_id, menu, id) will help performance with big tables a lot.
If there is no PK and no reliable UNIQUE (combination of) column(s), you can fall back to using the ctid. If NULL values can be involved, you need to specify how to deal with those.
Consider:
Delete duplicate rows from small table
How to delete duplicate rows without unique identifier
How do I (or can I) SELECT DISTINCT on multiple columns?
After cleaning up duplicates, add a UNIQUE constraint to prevent new dupes:
ALTER TABLE tbl ADD CONSTRAINT tbl_setup_id_menu_uni UNIQUE (setup_id, menu);
If you had an index on (setup_id, menu), drop that now. It's superseded by the UNIQUE constraint.
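Here is a small sketch of the EXISTS delete followed by the uniqueness guard, run against the sample data from the question. It assumes SQLite through Python's sqlite3 module rather than PostgreSQL, so two details differ from the answer: the delete target is referenced by its table name instead of an alias, and a unique index stands in for ALTER TABLE ... ADD CONSTRAINT, which SQLite does not support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl (
        id INTEGER PRIMARY KEY,
        setup_id INTEGER NOT NULL,
        menu TEXT NOT NULL,
        label TEXT
    );
    INSERT INTO tbl VALUES
        (1, 10, 'main', 'txt'), (2, 10, 'main', 'txt'),
        (3, 11, 'second', 'txt'), (4, 11, 'second', 'txt'),
        (5, 12, 'third', 'txt');

    -- Delete every row for which a duplicate with a lower id exists.
    DELETE FROM tbl
    WHERE EXISTS (
        SELECT 1 FROM tbl AS t1
        WHERE t1.setup_id = tbl.setup_id
          AND t1.menu = tbl.menu
          AND t1.id < tbl.id
    );

    -- Stand-in for the UNIQUE constraint: block new duplicates from now on.
    CREATE UNIQUE INDEX tbl_setup_id_menu_uni ON tbl (setup_id, menu);
    """
)

kept = [row[0] for row in conn.execute("SELECT id FROM tbl ORDER BY id")]
print(kept)

# A new duplicate of (10, 'main') is now rejected.
try:
    conn.execute("INSERT INTO tbl VALUES (6, 10, 'main', 'txt')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```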
I have found a solution that fits me the best.
Here it is if anyone needs it:
DELETE FROM table_name
WHERE id IN
(SELECT id
FROM
(SELECT id,
ROW_NUMBER() OVER( PARTITION BY setup_id,
menu
ORDER BY id ) AS row_num
FROM table_name ) t
WHERE t.row_num > 1 );
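To make the ROW_NUMBER() approach easier to follow, the sketch below first lists the number assigned to each row and then runs the delete. It is an illustration only, assuming SQLite 3.25 or newer (needed for window functions) through Python's sqlite3 module instead of PostgreSQL, with integer ids for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_name (id INTEGER PRIMARY KEY, setup_id INTEGER, menu TEXT, label TEXT);
    INSERT INTO table_name VALUES
        (1, 10, 'main', 'txt'), (2, 10, 'main', 'txt'),
        (3, 11, 'second', 'txt'), (4, 11, 'second', 'txt'),
        (5, 12, 'third', 'txt');
    """
)

# Number the rows inside each (setup_id, menu) group, lowest id first.
numbered = conn.execute(
    """
    SELECT id, ROW_NUMBER() OVER (PARTITION BY setup_id, menu ORDER BY id) AS row_num
    FROM table_name
    ORDER BY id
    """
).fetchall()
print(numbered)

# Every row numbered 2 or higher is a duplicate and gets deleted.
conn.execute(
    """
    DELETE FROM table_name
    WHERE id IN (
        SELECT id FROM (
            SELECT id, ROW_NUMBER() OVER (PARTITION BY setup_id, menu ORDER BY id) AS row_num
            FROM table_name
        ) AS t
        WHERE t.row_num > 1
    )
    """
)
survivors = [row[0] for row in conn.execute("SELECT id FROM table_name ORDER BY id")]
print(survivors)
```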
link: https://www.postgresql.org/docs/current/queries-union.html
https://www.postgresql.org/docs/current/sql-select.html#SQL-DISTINCT
Let's say the table name is a:
select distinct on (setup_id,menu ) a.* from a;
Key point: The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
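This means the ORDER BY in this DISTINCT ON query must begin with setup_id, menu; any further expressions (for example id) decide which row of each group is kept. Without an ORDER BY, the row kept from each group is unpredictable.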
Want the opposite:
EXCEPT returns all rows that are in the result of query1 but not in the result of query2. (This is sometimes called the difference between two queries.) Again, duplicates are eliminated unless EXCEPT ALL is used.
SELECT * FROM a
EXCEPT
select distinct on (setup_id,menu ) a.* from a;
You can try something along these lines to delete all but the first row in case of duplicates (please note that this is not tested in any way!):
DELETE FROM your_table WHERE id IN (
SELECT unnest(duplicate_ids[2:]) FROM (
SELECT array_agg(id ORDER BY id) AS duplicate_ids FROM your_table
GROUP BY SETUP_ID, MENU
HAVING COUNT(*) > 1
) AS dupes
)
The above collects the ids of the duplicate rows (COUNT(*) > 1) in an array (array_agg), then takes all but the first element in that array ([2:]) and "explodes" the id values into rows (unnest).
The outer query just deletes every id that ends up in that result.
For MySQL, a similar question is already answered here: Find and remove duplicate rows by two columns
See whether any of the approaches there helps.
I like the below one for MySql:
ALTER IGNORE TABLE your_table ADD UNIQUE (SETUP_ID, MENU);
DELETE t1
FROM table_name t1
join table_name t2 on
t2.setup_id = t1.setup_id and t2.menu = t1.menu and t2.id < t1.id
There are many ways to find and delete duplicate rows based on conditions, but I like the inner join method, which works very fast even on a large amount of data. The general form is:
DELETE T1 FROM <TableName> T1
INNER JOIN <TableName> T2
WHERE
T1.id > T2.id AND
T1.<ColumnName1> = T2.<ColumnName1> AND T1.<ColumnName2> = T2.<ColumnName2>;
In your case you can write it as follows:
DELETE T1 FROM <TableName> T1
INNER JOIN <TableName> T2
WHERE
T1.id > T2.id AND
T1.setup_id = T2.setup_id AND T1.menu = T2.menu;
Let me know if you face any issue or need more help.

Oracle SQL - return record only if colB is the same for all of colA

I have a table like the following ( there is of course other data in the table):
Col A Col B
1 Red
1 Red
2 Blue
2 Green
3 Black
I am trying to return a value for Col A only when ALL the Col B values match, otherwise return null.
This will be used as part of another sql statement that will be passing the Col A value, ie
Select * from Table where Col A = 1
I need to return the value in Col B. The correct result for the above table would be Red, Black.
Any ideas?
how about this?
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table t( id number, color varchar2(20));
insert into t values(1,'RED');
insert into t values(1,'RED');
insert into t values(2,'BLUE');
insert into t values(2,'GREEN');
insert into t values(3,'BLACK');
Query 1:
select color from t where id in (
select id
from t
group by id having min(color) = max(color) )
group by color
Results:
| COLOR |
|-------|
| RED |
| BLACK |
If you just want the values in A (rather than each row), then use group by:
select a
from table t
group by a
having min(b) = max(b);
Note: this ignores NULL values. If you want to treat them as an additional value, then add another condition:
select a
from table t
group by a
having min(b) = max(b) and count(*) = count(b);
It is also tempting to use count(distinct). In general, though, count(distinct) requires more processing effort than a min() and a max().
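The difference between the two HAVING clauses only shows up when NULLs are present, so the sketch below adds one hypothetical group (a = 4, with 'White' and NULL) to the question's data. It assumes SQLite via Python's sqlite3 module instead of Oracle, and the table name t is made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (a INTEGER, b TEXT);
    INSERT INTO t VALUES
        (1, 'Red'), (1, 'Red'),
        (2, 'Blue'), (2, 'Green'),
        (3, 'Black'),
        (4, 'White'), (4, NULL);  -- hypothetical extra group containing a NULL
    """
)

# MIN and MAX skip NULLs, so group 4 looks uniform here.
ignoring_nulls = [
    row[0]
    for row in conn.execute(
        "SELECT a FROM t GROUP BY a HAVING MIN(b) = MAX(b) ORDER BY a"
    )
]

# COUNT(*) counts all rows, COUNT(b) only non-NULL ones, so group 4 is rejected.
nulls_as_values = [
    row[0]
    for row in conn.execute(
        "SELECT a FROM t GROUP BY a "
        "HAVING MIN(b) = MAX(b) AND COUNT(*) = COUNT(b) ORDER BY a"
    )
]

print(ignoring_nulls)
print(nulls_as_values)
```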
You can use a case expression:
select cola,
case when max(colb) = min(colb) and count(*) = count(colb) then max(colb)
end as colb
from tablename
group by cola
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table t( id number, color varchar2(20));
insert into t values(1,'RED');
insert into t values(1,'RED');
insert into t values(2,'BLUE');
insert into t values(2,'GREEN');
insert into t values(3,'BLACK');
Query 1:
select id
from t
group by id having min(color) = max(color)
Results:
| ID |
|----|
| 1 |
| 3 |
hope this is what you were looking for.. :)

MySQL get rows but prefer one column value over another

A bit of a strange one: I want to write a MySQL query that will get results from a table, but prefer one value of a column over another, i.e.
id name value priority
1 name1 value1 NULL
2 name1 value1 1
3 name2 value2 NULL
4 name3 value3 NULL
So here name1 has two entries, but one has a priority of 1. I want to get all the values from the table, but prefer the values with whatever priority I'm after.
The results I'd be after would be
id name value priority
2 name1 value1 1
3 name2 value2 NULL
4 name3 value3 NULL
An equivalent way of saying it would be 'get all rows from the table, but prefer rows with a priority of x'.
This should do it:
SELECT
T1.id,
T1.name,
T1.value,
T1.priority
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON
T2.name = T1.name AND
T2.priority > COALESCE(T1.priority, -1)
WHERE
T2.id IS NULL
This also allows you to have multiple priority levels with the highest being the one that you want to return (if you had a 1 and 2, the 2 would be returned).
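As a quick check of the self-join above, here is a sketch that runs it on the question's four rows. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, and it carries over the answer's assumption that real priorities are never negative, which is what makes COALESCE(T1.priority, -1) safe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE My_Table (id INTEGER PRIMARY KEY, name TEXT, value TEXT, priority INTEGER);
    INSERT INTO My_Table VALUES
        (1, 'name1', 'value1', NULL),
        (2, 'name1', 'value1', 1),
        (3, 'name2', 'value2', NULL),
        (4, 'name3', 'value3', NULL);
    """
)

# A row survives only if no other row with the same name has a higher priority.
rows = conn.execute(
    """
    SELECT T1.id, T1.name, T1.value, T1.priority
    FROM My_Table T1
    LEFT OUTER JOIN My_Table T2
        ON T2.name = T1.name
       AND T2.priority > COALESCE(T1.priority, -1)
    WHERE T2.id IS NULL
    ORDER BY T1.id
    """
).fetchall()

for row in rows:
    print(row)
```

Row 1 is dropped because row 2 has the same name and a higher priority; rows 3 and 4 have no competitor and pass through unchanged.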
I will also say though that it does seem like there are some design problems in the DB. My approach would have been:
My_Table (id, name)
My_Values (id, priority, value)
with an FK on id to id. PKs on id in My_Table and id, priority in My_Values. Of course, I'd use appropriate table names too.
You need to redesign your table first.
It should be:
YourTable (Id, Name, Value)
YourTablePriority (PriorityId, Priority, Id)
Update:
select * from YourTable a
where a.Id not in
(select b.Id from YourTablePriority b)
This should work in sql server, you may need a little change to make it work in mysql.
Maybe something like:
SELECT id, name, value, priority FROM
table_name GROUP BY name ORDER BY priority
Although not having a database in front of me I can't test it...
If I understand correctly, you want the value of a name given a specific priority, or the value associated with a NULL priority. (You do not necessarily want the MAX(priority) that exists.)
Yes, you've got some awkward design issues which you should address, but let's solve the problem you do have at present (and you can later migrate to the problem you ought to have :) ):
mysql> SET @priority = 1; -- the priority we want, if recorded
mysql> PREPARE stmt FROM "
SELECT
t0.*
FROM
t t0
LEFT JOIN
(SELECT DISTINCT name, priority FROM t WHERE priority = ?) t1
ON t0.name = t1.name
WHERE
t0.priority = t1.priority
OR
t1.priority IS NULL
";
mysql> EXECUTE stmt USING @priority;
+----+-------+--------+----------+
| id | name | value | priority |
+----+-------+--------+----------+
| 2 | name1 | valueX | 1 |
| 3 | name2 | value2 | NULL |
| 4 | name3 | value3 | NULL |
+----+-------+--------+----------+
3 rows in set (0.00 sec)
(Note that I changed the prioritized value of "name1" to "valueX" in the above -- your original formulation had identical value values for "name1" regardless of priority, which made it hard for me to understand why you cared to discriminate one from the other.)
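The same parameterised query can be sketched outside the mysql client. The version below is an assumption-laden illustration: it uses SQLite via Python's sqlite3 module, where the ? placeholder is bound from Python instead of through PREPARE / EXECUTE, and wanted_priority is a made-up variable name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, value TEXT, priority INTEGER);
    INSERT INTO t VALUES
        (1, 'name1', 'value1', NULL),
        (2, 'name1', 'valueX', 1),
        (3, 'name2', 'value2', NULL),
        (4, 'name3', 'value3', NULL);
    """
)

wanted_priority = 1  # the priority we want, if recorded

# t1 lists the names that have a row at the wanted priority.
# Names found there keep only that row; all other names keep their rows as-is.
preferred = conn.execute(
    """
    SELECT t0.id, t0.name, t0.value, t0.priority
    FROM t AS t0
    LEFT JOIN (SELECT DISTINCT name, priority FROM t WHERE priority = ?) AS t1
           ON t0.name = t1.name
    WHERE t0.priority = t1.priority
       OR t1.priority IS NULL
    ORDER BY t0.id
    """,
    (wanted_priority,),
).fetchall()

for row in preferred:
    print(row)
```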

Adding Row Numbers To a SELECT Query Result in SQL Server Without use Row_Number() function

I need to add row numbers to a SELECT query without using the Row_Number() function,
and without using user-defined functions or stored procedures.
Select (obtain the row number) as [Row], field1, field2, fieldn from aTable
UPDATE
I am using SAP B1 DIAPI to run the query, and this system does not allow the use of the Row_Number() function in the select statement.
Bye.
I'm not sure if this will work for your particular situation or not, but can you execute this query with a stored procedure? If so, you can:
A) Create a temp table with all your normal result columns, plus a Row column as an auto-incremented identity.
B) Select-Insert your original query, sans the row column (SQL will fill this in automatically for you)
C) Select * on the temp table for your result set.
Not the most elegant solution, but will accomplish the row numbering you are wanting.
This query will give you the row_number,
SELECT
(SELECT COUNT(*) FROM @table t2 WHERE t2.field <= t1.field) AS row_number,
field,
otherField
FROM @table t1
but there are some restrictions when you want to use it. You have to have one column in your table (in the example it is field) which is unique and numeric and you can use it as a reference. For example:
DECLARE @table TABLE
(
field INT,
otherField VARCHAR(10)
)
INSERT INTO @table(field,otherField) VALUES (1,'a')
INSERT INTO @table(field,otherField) VALUES (4,'b')
INSERT INTO @table(field,otherField) VALUES (6,'c')
INSERT INTO @table(field,otherField) VALUES (7,'d')
SELECT * FROM @table
returns
field | otherField
------------------
1 | a
4 | b
6 | c
7 | d
and
SELECT
(SELECT COUNT(*) FROM @table t2 WHERE t2.field <= t1.field) AS row_number,
field,
otherField
FROM @table t1
returns
row_number | field | otherField
-------------------------------
1 | 1 | a
2 | 4 | b
3 | 6 | c
4 | 7 | d
This is a solution without functions or stored procedures, but as I said, it has restrictions. Still, it may be enough for you.
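For anyone who wants to try the counting trick without a SQL Server instance, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module, and the table name demo and alias row_num are made up for the example. Each row's number is simply how many rows have a field value less than or equal to its own, which is why the column has to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE demo (field INTEGER, otherField TEXT);
    INSERT INTO demo VALUES (1, 'a'), (4, 'b'), (6, 'c'), (7, 'd');
    """
)

# Correlated subquery: count the rows at or below the current row's field value.
rows = conn.execute(
    """
    SELECT
        (SELECT COUNT(*) FROM demo AS t2 WHERE t2.field <= t1.field) AS row_num,
        field,
        otherField
    FROM demo AS t1
    ORDER BY field
    """
).fetchall()

for row in rows:
    print(row)
```

The subquery runs once per outer row, so the cost grows roughly quadratically with the table size unless field is indexed.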
RRUZ, you might be able to hide the use of a function by wrapping your query in a View. It would be transparent to the caller. I don't see any other options, besides the ones already mentioned.