Delete duplicate data that some columns equal zero - sql

I have a SQL Server table with the columns col1, col2, col3, col4, col5, col6, col7, col8, col9, col10.
I want to delete duplicates, where a duplicate is defined by col1, col2, col3.
Within each set of duplicates, the row to delete is the one where col6=0 and col7=0 and col8=0.

We can use a deletable CTE here:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY col1, col2, col3) cnt
FROM yourTable
)
DELETE
FROM cte
WHERE cnt > 1 AND col6 = 0 AND col7 = 0 AND col8 = 0;
The CTE above identifies "duplicates" according to your definition, which is 2 or more records having the same values for col1, col2, and col3. Then we delete duplicates meeting the requirements on the other 3 columns.
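One caveat: if every row in a duplicate group has zeros in col6, col7, and col8, the statement above removes the whole group. If at least one row per group should survive, a variation along these lines may help. It is only a sketch that assumes SQL Server and reuses the placeholder names from the question (yourTable, col1 to col8); the rn alias is introduced here for illustration.

```sql
-- Sketch only: yourTable and the column names are the question's placeholders.
WITH cte AS (
    SELECT *,
           -- Rows with non-zero values sort first, so they receive rn = 1.
           ROW_NUMBER() OVER (
               PARTITION BY col1, col2, col3
               ORDER BY CASE WHEN col6 = 0 AND col7 = 0 AND col8 = 0
                             THEN 1 ELSE 0 END
           ) AS rn
    FROM yourTable
)
DELETE
FROM cte
-- rn = 1 is never deleted, so each group keeps at least one row.
WHERE rn > 1 AND col6 = 0 AND col7 = 0 AND col8 = 0;
```

Replacing the DELETE with SELECT * and running the statement first shows which rows would be removed.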

Related

Slowness in SQL query with subquery

My SQL query becomes too slow when a subquery is added to the WHERE clause, even though each query takes less than 1 minute when run on its own.
The query has the following skeleton:
SELECT COL1, COL2, COL3, COL4, COL5, COL6, sum(COL7) FROM TABLE1
WHERE Col1 = 'something' AND COl2 = date AND Col3 = (SELECT MAX(COLUMN1) FROM TABLE2)
GROUP BY COL1, COL2, COL3, COL4, COL5, COL6
This query runs on Sybase IQ.
Table 1 has 60M+ rows, of which just 60 remain after the filter conditions are applied; the query usually takes 50 seconds if the subquery is replaced with a hardcoded value.
Table 2 has 200 rows, and the subquery returns a single integer value; run on its own, it takes 1 second.
Move the subquery to the FROM clause:
SELECT t1.COL1, t1.COL2, t1.COL3, t1.COL4, t1.COL5, t1.COL6, sum(t1.COL7)
FROM TABLE1 t1 JOIN
(SELECT MAX(COLUMN1) as max_column1 FROM TABLE2) t2
ON t2.max_column1 = t1.COL3
WHERE t1.COL1 = 'something' AND t1.COL2 = date
GROUP BY t1.COL1, t1.COL2, t1.COL3, t1.COL4, t1.COL5, t1.COL6;
Then you want indexes on:
table1(col1, col2, col3)
table2(column1).
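As a rough illustration, the two indexes could be created as follows. This uses generic CREATE INDEX syntax with made-up index names, and it assumes the filtered and joined columns are COL1, COL2, and COL3 as in the question. Sybase IQ offers several index types, so check its documentation for the one that fits.

```sql
-- Hypothetical index names; the index type may need adjusting for Sybase IQ.
CREATE INDEX idx_table1_col1_col2_col3 ON TABLE1 (COL1, COL2, COL3);
CREATE INDEX idx_table2_column1 ON TABLE2 (COLUMN1);
```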
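You could also replace the subquery with a join on TABLE2 and see if that helps. Note that this is not strictly equivalent to the original: it matches any COLUMN1 value in TABLE2 rather than only the maximum, so compare the results before relying on it.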
SELECT max(COLUMN1), COL1, COL2, COL3, COL4, COL5, COL6, sum(COL7)
FROM TABLE1 t1 join TABLE2 t2
on t1.Col3 = t2.COLUMN1
WHERE Col1 = 'something' AND COl2 = date
GROUP BY COLUMN1, COL1, COL2, COL3, COL4, COL5, COL6
You are not using parameters, but this sounds oddly like parameter sniffing.

oracle sql group by column which count them

I ran the query below to fetch counts. I can see a count in the output, but it does not work the way I want.
How can I show the counts of col6 and col7 in the output?
Am I clear?
select col1, col2,
col3, col4, decode(col5,'S','Success','F','Failed'), col6, col7, count(*)
from mytable
where col1 in (select FIELD1 from temp)
and col8 = 4
group by col1, col2, col3,col4,col5,col6,col7
See if this works
select count(col6), count(col7)
from mytable
where col1 in (select FIELD1 from temp)
and col8 = 4;
You need to apply an aggregate function to col6 and col7 and remove them from the GROUP BY clause, as in the following query:
select col1, col2, col3, col4, decode(col5,'S','Success','F','Failed'),
count(col6), count(col7), count(*) -- used count for col6 and col7
from mytable
where col1 in (select FIELD1 from temp)
and col8 = 4
group by col1, col2, col3,col4,col5 -- removed col6 and col7 from here
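The choice of count matters when the columns contain NULLs: COUNT(*) counts every row in the group, while COUNT(col6) counts only the rows where col6 is not NULL. A small sketch against the same mytable, with column aliases that are made up for illustration:

```sql
select col1,
       count(*)             as all_rows,       -- every row in the group
       count(col6)          as col6_not_null,  -- rows where col6 is not NULL
       count(distinct col6) as col6_distinct   -- distinct non-NULL col6 values
from mytable
where col8 = 4
group by col1;
```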

SQL - Is column exclusion possible from 'SELECT' clause?

I run into this every time I do a lot of complex processing with many columns selected in a subquery, but only need to show a few of them in the end.
Is any SQL vendor (Oracle, Microsoft, or others) considering an extra clause that simply excludes the columns that are not required?
;with t as (
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from orders_tbl
where order_date > getdate() -- ex. T-sql
)
, s as (
select t.*, row_number() over(partition by col1 order by col8 DESC, col9) rn
from t
)
--
-- The problem is here: if I don't explicitly select the individual columns of "t", the result also includes the column "rn", which is not required.
--
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from s where rn = 1
order by col1, col2
Now, imagine something like this -
with t as (
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from orders_tbl
where order_date > getdate() -- ex. T-sql
)
, s as (
select t.*, row_number() over(partition by col1 order by col8 DESC, col9) rn
from t
)
--
-- Note: the imaginary clause "exclude"
--
select *
from s exclude (rn) where rn = 1
order by col1, col2
Your thoughts please?
It would be nice if MS SQL Server supported something like SELECT * EXCEPT (col) FROM tbl, as Google BigQuery does.
But currently that functionality isn't (yet?) implemented in MS SQL Server.
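For comparison, this is roughly how the question's query might look in BigQuery. It is a sketch in BigQuery syntax only, it will not run on SQL Server, and it assumes order_date is a TIMESTAMP column.

```sql
-- BigQuery syntax; table and column names are the question's placeholders.
with s as (
  select *,
         row_number() over (partition by col1 order by col8 desc, col9) as rn
  from orders_tbl
  where order_date > current_timestamp()
)
select * except (rn)  -- every column of s apart from rn
from s
where rn = 1
order by col1, col2;
```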
However, the SQL in the question can be simplified to use only one CTE, since TOP 1 WITH TIES can be combined with an ORDER BY ROW_NUMBER() OVER (...).
That way there is no rn column to exclude from the final result.
with T as (
select TOP 1 WITH TIES
col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from orders_tbl
where order_date > getdate()
ORDER BY row_number() over(partition by col1 order by col8 DESC, col9)
)
select *
from T
order by col1, col2;
Note that the CTE is only needed here because the final result still has to be ordered by col1, col2.
Side-note One:
For simple queries, selecting the required columns in the outer query seems to be the more common approach.
with CTE as (
select *
, row_number() over(partition by col1 order by col8 DESC, col9) as rn
from orders_tbl
where order_date > getdate()
)
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from CTE
where rn = 1
order by col1, col2;
Side-note Two:
I would love to see something like Teradata's QUALIFY clause added to the SQL standard someday. It is handy when you need to filter on a window function such as ROW_NUMBER or DENSE_RANK.
In Teradata, that SQL could be shortened to this:
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from orders_tbl
where order_date > current_timestamp
QUALIFY row_number() over(partition by col1 order by col8 DESC, col9) = 1
order by col1, col2
One way is to select into a new table and then drop the columns:
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, row_number() over(partition by col1 order by col8 DESC, col9) rn
into #a
from orders_tbl
where order_date > getdate()
-- keep only the first row per col1, then drop the helper column
delete from #a where rn > 1
alter table #a drop column rn
select * from #a
Note that this is not optimal for performance, since some data is read only to be deleted again. But it is handy for small data sets and ad hoc queries.

with clause in union query

I have a WITH clause in a UNION query, like this:
with t1 as(...) ---common for both query
select * from t2
union
select * from t3
How do I use the same WITH clause in both queries?
You can reuse a common table expression (CTE) in both parts of the union.
For example:
with cte as
(
select col1, col2, col3, col4, col5, col6
from sometable
where col1 = 42
)
select col1, col2, col3
from cte as t1
union all
select col4, col5, col6
from cte as t2
If you need more than one CTE, separate them with a comma.
with cte1 as
(
select col1, col2, col3
from sometable
where col1 = 42
group by col1, col2, col3
)
, cte2 as
(
select col4, col5, col6
from sometable
where col4 > col5
group by col4, col5, col6
)
select col1, col2, col3
from cte1 as t1
union all
select col4, col5, col6
from cte2 as t2
In this example, though, the CTEs serve mostly aesthetic purposes, by placing the more complicated queries at the top of the SQL.
It would be more straightforward to union the queries from the CTEs directly:
select col1, col2, col3
from sometable
where col1 = 42
group by col1, col2, col3
union all
select col4, col5, col6
from sometable
where col4 > col5
group by col4, col5, col6

SQL inserting data from table1 to table2

I have 2 tables in SQL
table 1: col1, col2, col3, col4, col5, col6, col7
table 2: col1, col2, col4, col5, col10 (newcol)
col10 (newcol) should be given default value 0
I need to copy data from table 1 to table2
It is a pretty simple insert using a select. If you only need to do it once, you could skip the query and use the menu option for importing data, then just follow the prompts. Otherwise:
INSERT INTO table2
(col1, col2, col4, col5, col10)
SELECT col1, col2, col4, col5, 0
FROM table1;
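If col10 should also default to 0 for future inserts, a default on the column lets the INSERT omit it. The following is a sketch that assumes SQL Server and that col10 has no default yet; the constraint name df_table2_col10 is hypothetical.

```sql
-- Assumes SQL Server; df_table2_col10 is a made-up constraint name.
ALTER TABLE table2
    ADD CONSTRAINT df_table2_col10 DEFAULT 0 FOR col10;

-- col10 is omitted here, so each new row receives the default value 0.
INSERT INTO table2 (col1, col2, col4, col5)
SELECT col1, col2, col4, col5
FROM table1;
```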