delete duplicate rows from my table

I need to delete all duplicate rows in my table, but keep one row from each group of duplicates.
MyTbl
====
Code | ID | Place | Qty | User
========================================
1 | 22 | 44 | 34 | 333
2 | 22 | 44 | 34 | 333
3 | 22 | 55 | 34 | 333
4 | 22 | 44 | 34 | 666
5 | 33 | 77 | 12 | 999
6 | 44 | 11 | 87 | 333
7 | 33 | 77 | 12 | 999
I need to end up with this:
Code | ID | Place | Qty | User
=======================================
1 | 22 | 44 | 34 | 333
3 | 22 | 55 | 34 | 333
4 | 22 | 44 | 34 | 666
5 | 33 | 77 | 12 | 999
6 | 44 | 11 | 87 | 333

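In most databases, the fastest way to do this is to copy the rows you want to keep into a new table, empty the original, and reload it. Because Code is different on every row, select distinct over all columns would not remove anything, so group by the other columns and keep the lowest Code:
select min(Code) as Code, ID, Place, Qty, [User]
into saved
from mytbl
group by ID, Place, Qty, [User];
delete from mytbl;
insert into mytbl (Code, ID, Place, Qty, [User])
select Code, ID, Place, Qty, [User]
from saved;
The above syntax should work in Access. Other databases would use truncate table instead of delete.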

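Try this (SQL Server syntax; deleting through a CTE is not portable to every database):
WITH CTEMyTbl (Code, duplicateRecCount)
AS
(
-- Number the rows within each group of identical (ID, Place, Qty, User) values
SELECT Code, ROW_NUMBER() OVER (PARTITION BY ID, Place, Qty, [User] ORDER BY Code)
AS duplicateRecCount
FROM MyTbl
)
-- Keep row 1 of each group (the lowest Code) and delete the rest
DELETE FROM CTEMyTbl
WHERE duplicateRecCount > 1;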
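For databases without ROW_NUMBER() or deletable CTEs, the same result can be had with a plain GROUP BY. The sketch below is not either answer's method: it swaps in a NOT IN (SELECT MIN(Code) ...) delete and runs it against an in-memory SQLite database through Python's sqlite3 module, which is an assumption made only so the example is self-contained. The table and rows are the ones from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE MyTbl (Code INTEGER PRIMARY KEY, ID INT, Place INT, Qty INT, "User" INT)'
)
rows = [
    (1, 22, 44, 34, 333),
    (2, 22, 44, 34, 333),
    (3, 22, 55, 34, 333),
    (4, 22, 44, 34, 666),
    (5, 33, 77, 12, 999),
    (6, 44, 11, 87, 333),
    (7, 33, 77, 12, 999),
]
conn.executemany("INSERT INTO MyTbl VALUES (?, ?, ?, ?, ?)", rows)

# Keep the lowest Code in each (ID, Place, Qty, User) group and delete the rest.
conn.execute(
    """
    DELETE FROM MyTbl
    WHERE Code NOT IN (
        SELECT MIN(Code) FROM MyTbl GROUP BY ID, Place, Qty, "User"
    )
    """
)

remaining_codes = [r[0] for r in conn.execute("SELECT Code FROM MyTbl ORDER BY Code")]
print(remaining_codes)
```

The surviving Code values match the desired output in the question. Some databases (MySQL, for one) refuse a subquery on the table being deleted from, so treat this as a starting point rather than a universal statement.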

Related

How to reshape a table having multiple records for the same id into a table with one record per id without losing information?

Basically, I want to transform the initial table into the final table (both shown below). In other words, I want to:
"squash" the initial table so that it has only one record per id
"dilate" the initial table so that I don't lose any information: create a separate column for every possible combination of source and column from the initial table (c1_A, c1_B, ...).
I can work with the initial table as a csv in Python (maybe Pandas) and manually hardcode the mapping between the initial and the final table. However, I don't find this solution elegant at all, and I'm much more interested in a sql / sas solution. Is there any way of doing that?
Edit: I want to change
+----+--------+------+-----+------+
| ID | source | c1 | c2 | c3 |
+----+--------+------+-----+------+
| 1 | A | 432 | 56 | 1 |
| 1 | B | 53 | 3 | 73 |
| 1 | C | 7 | 342 | 83 |
| 1 | D | 543 | 43 | 73 |
| 2 | A | 8 | 882 | 39 |
| 2 | B | 5 | 54 | 46 |
| 2 | C | 8 | 3 | 2226 |
| 2 | D | 87 | 2 | 45 |
| 3 | A | 93 | 143 | 45 |
| 3 | B | 1023 | 72 | 8 |
| 3 | C | 3 | 3 | 704 |
| 4 | A | 2 | 5 | 0 |
| 4 | B | 78 | 888 | 2 |
| 4 | C | 87 | 23 | 34 |
| 4 | D | 112 | 7 | 712 |
+----+--------+------+-----+------+
into
+----+------+------+------+------+------+------+------+------+------+------+------+------+
| ID | c1_A | c1_B | c1_C | c1_D | c2_A | c2_B | c2_C | c2_D | c3_A | c3_B | c3_C | c3_D |
+----+------+------+------+------+------+------+------+------+------+------+------+------+
| 1 | 432 | 53 | 7 | 543 | 56 | 3 | 342 | 43 | 1 | 73 | 83 | 73 |
| 2 | 8 | 5 | 8 | 87 | 882 | 54 | 3 | 2 | 39 | 46 | 2226 | 45 |
| 3 | 93 | 1023 | 3 | | 143 | 72 | 3 | | 45 | 8 | 704 | |
| 4 | 2 | 78 | 87 | 112 | 5 | 888 | 23 | 7 | 0 | 2 | 34 | 712 |
+----+------+------+------+------+------+------+------+------+------+------+------+------+

Abandon hope ... ?
data want;
input ID source $ c1 c2 c3;
datalines;
1 A 432 56 1
1 B 53 3 73
1 C 7 342 83
1 D 543 43 73
2 A 8 882 39
2 B 5 54 46
2 C 8 3 2226
2 D 87 2 45
3 A 93 143 45
3 B 1023 72 8
3 C 3 3 704
4 A 2 5 0
4 B 78 888 2
4 C 87 23 34
4 D 112 7 712
;
* one to grow you oh data;
proc transpose data=want out=stage1;
by id source;
var c1-c3;
run;
* and one to shrink;
proc transpose data=stage1 out=want(drop=_name_) delim=_;
by id;
id _name_ source;
run;
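Since the question also asks about SQL, here is a rough sketch of the same reshape using conditional aggregation (one MAX(CASE ...) expression per column/source pair). It is not the SAS answer's method. It assumes SQLite via Python's sqlite3 module, a table named initial (a made-up name), and only the rows for IDs 1 and 3 from the question, to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "initial" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE initial (ID INT, source TEXT, c1 INT, c2 INT, c3 INT)")
conn.executemany(
    "INSERT INTO initial VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", 432, 56, 1), (1, "B", 53, 3, 73), (1, "C", 7, 342, 83), (1, "D", 543, 43, 73),
        (3, "A", 93, 143, 45), (3, "B", 1023, 72, 8), (3, "C", 3, 3, 704),
    ],
)

# Discover the sources instead of hardcoding them.
sources = [r[0] for r in conn.execute("SELECT DISTINCT source FROM initial ORDER BY source")]
columns = ["c1", "c2", "c3"]

# One MAX(CASE ...) per (column, source) pair. The names are interpolated into
# the SQL text, so this is only appropriate for trusted column and source values.
exprs = [
    f"MAX(CASE WHEN source = '{s}' THEN {c} END) AS {c}_{s}"
    for c in columns
    for s in sources
]
sql = "SELECT ID, " + ", ".join(exprs) + " FROM initial GROUP BY ID ORDER BY ID"

cur = conn.execute(sql)
header = [d[0] for d in cur.description]
wide = [dict(zip(header, row)) for row in cur]
for record in wide:
    print(record)
```

ID 3 has no source D, so its c1_D, c2_D and c3_D come back as NULL (None in Python), matching the blanks in the desired table.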

CTE - recursive query doing too much

I have the current table of data...
| LoanRollupID | NewLoanID | PreviousLoanID |
|--------------|-----------|----------------|
| 11 | 76 | 44 |
| 12 | 80 | 75 |
| 13 | 83 | 82 |
| 14 | 84 | 83 |
| 15 | 86 | 85 |
| 16 | 87 | 54 |
| 17 | 88 | 87 |
| 18 | 90 | 48 |
| 19 | 91 | 34 |
| 20 | 93 | 41 |
| 21 | 94 | 76 |
| 22 | 95 | 90 |
| 23 | 96 | 94 |
| 24 | 100 | 92 |
| 25 | 101 | 99 |
| 26 | 102 | 98 |
| 27 | 103 | 101 |
| 28 | 104 | 81 |
| 29 | 105 | 80 |
| 30 | 107 | 52 |
| 31 | 110 | 108 |
| 1029 | 1105 | 103 |
| 1030 | 1106 | 104 |
| 1031 | 1108 | 1106 |
| 1032 | 1109 | 73 |
I'm trying to jump in at NewLoanID 1108 and see how it has evolved from previous loans, e.g. 1108 came from 1106, which came from 104, which came from 81, etc.
When I run this query:
WITH OldLoans (PreviousLoanID, NewLoanID, start)
AS
(
---- Anchor member definition
SELECT l.NewLoanID, l.PreviousLoanID, 0 as start
FROM dscs_public.LoanRollup l
Where NewLoanID = 1108
UNION ALL
-- Recursive member definition
SELECT l.NewLoanID, l.PreviousLoanID, start + 1
FROM dscs_public.LoanRollup l
INNER JOIN OldLoans AS o
ON o.NewLoanID = l.PreviousLoanID
)
---- Statement that executes the CTE
SELECT PreviousLoanID, NewLoanID, start
FROM OldLoans
It fails with this error:
The statement terminated. The maximum recursion 100 has been exhausted
before statement completion.
Can anyone spot my mistake please?
Thanks.

The column names in the CTE definition are in the wrong order, so they do not line up with the SELECT lists inside the CTE:
-- Instead of (PreviousLoanID, NewLoanID, start)
WITH OldLoans (NewLoanID, PreviousLoanID, start)
AS
(
---- Anchor member definition
SELECT l.NewLoanID, l.PreviousLoanID, 0 as start
FROM mytable l --LoanRollup l
Where NewLoanID = 1108
UNION ALL
-- Recursive member definition
SELECT l.NewLoanID, l.PreviousLoanID, start + 1
FROM mytable l --dscs_public.LoanRollup l
INNER JOIN OldLoans AS o
-- Instead of o.NewLoanID = l.PreviousLoanID
ON l.NewLoanID = o.PreviousLoanID
)
---- Statement that executes the CTE
SELECT PreviousLoanID, NewLoanID, start
FROM OldLoans
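The ON clause in the recursive member had the same problem. With the columns swapped, o.NewLoanID actually held the previous loan's ID, so the original join condition matched each row to itself and the recursion never terminated.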
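To check the corrected join direction on a small scale, the sketch below walks the chain from loan 1108 in an in-memory SQLite database. SQLite and Python's sqlite3 module are assumptions made for the sake of a runnable example (the question is about SQL Server, which does not use the RECURSIVE keyword), and only a handful of rows from the question's table are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LoanRollup (LoanRollupID INT, NewLoanID INT, PreviousLoanID INT)")
# A subset of the rows from the question, including the 1108 -> 1106 -> 104 -> 81 chain.
conn.executemany(
    "INSERT INTO LoanRollup VALUES (?, ?, ?)",
    [
        (28, 104, 81), (29, 105, 80), (1029, 1105, 103),
        (1030, 1106, 104), (1031, 1108, 1106), (1032, 1109, 73),
    ],
)

chain = conn.execute(
    """
    WITH RECURSIVE OldLoans (NewLoanID, PreviousLoanID, start) AS (
        -- Anchor: the loan we start from
        SELECT NewLoanID, PreviousLoanID, 0
        FROM LoanRollup
        WHERE NewLoanID = 1108
        UNION ALL
        -- Recursive step: find the row whose new loan is our previous loan
        SELECT l.NewLoanID, l.PreviousLoanID, o.start + 1
        FROM LoanRollup AS l
        JOIN OldLoans AS o ON l.NewLoanID = o.PreviousLoanID
    )
    SELECT NewLoanID, PreviousLoanID, start FROM OldLoans ORDER BY start
    """
).fetchall()

for row in chain:
    print(row)
```

The recursion stops on its own once it reaches loan 81, because no row has 81 as its NewLoanID.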

SQL - Calculating cell-value based on other cell-values

So I have run into a problem when working on some SQL coding.
I have a data table that looks somewhat like this:
ID TimeID IndicatorID Score
1 111 45 20
1 111 46 14
1 111 47 83
1 111 48 91
1 112 45 20
1 112 46 14
1 112 47 83
1 112 48 91
2 111 45 25
2 111 46 12
2 111 47 70
2 111 48 82
2 112 45 25
2 112 46 12
2 112 47 70
2 112 48 82
I want to add new rows containing values for indicator 240 and 241 where the score for indicator 240 is the score for indicator 45 / score for indicator 46 and similarly the score for indicator 241 is the score for indicator 47/ score for indicator 48. This has to be done for each TimeID for each ID.
The full table is huge as the number of IDs, TimeIDs for each ID, and IndicatorIDs for each TimeID is large.

Assuming your requirements are as stated, and all the IndicatorID values are hard-coded, this can be done with some simple sub-queries and a straightforward INSERT statement:
insert into your_table
with yt as (
select * from your_table where IndicatorID in (45,46,47,48)
)
, yt45 as (select * from yt where IndicatorID = 45 )
, yt46 as (select * from yt where IndicatorID = 46 )
, yt47 as (select * from yt where IndicatorID = 47 )
, yt48 as (select * from yt where IndicatorID = 48 )
select yt45.id
, yt45.timeID
, 240 as IndicatorID
, yt45.score/yt46.score as score
from yt45
join yt46
on yt45.id = yt46.id
and yt45.timeID = yt46.timeID
union all
select yt47.id
, yt47.timeID
, 241 as IndicatorID
, yt47.score/yt48.score as score
from yt47
join yt48
on yt47.id = yt48.id
and yt47.timeID = yt48.timeID
/
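The join-and-insert idea can be tried on a small sample before running it against the full table. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than Oracle, loads only the TimeID 111 rows from the question, and drives both ratios from a small list of (new indicator, numerator, denominator) triples instead of separate sub-queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Score is declared REAL so that the division below is not integer division in SQLite.
conn.execute("CREATE TABLE your_table (ID INT, TimeID INT, IndicatorID INT, Score REAL)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, 111, 45, 20), (1, 111, 46, 14), (1, 111, 47, 83), (1, 111, 48, 91),
        (2, 111, 45, 25), (2, 111, 46, 12), (2, 111, 47, 70), (2, 111, 48, 82),
    ],
)

# (new IndicatorID, numerator IndicatorID, denominator IndicatorID)
pairs = [(240, 45, 46), (241, 47, 48)]
for new_id, num_id, den_id in pairs:
    conn.execute(
        """
        INSERT INTO your_table (ID, TimeID, IndicatorID, Score)
        SELECT n.ID, n.TimeID, ?, n.Score / d.Score
        FROM your_table AS n
        JOIN your_table AS d
          ON d.ID = n.ID AND d.TimeID = n.TimeID
        WHERE n.IndicatorID = ? AND d.IndicatorID = ?
        """,
        (new_id, num_id, den_id),
    )

derived = conn.execute(
    """
    SELECT ID, TimeID, IndicatorID, Score
    FROM your_table
    WHERE IndicatorID IN (240, 241)
    ORDER BY ID, TimeID, IndicatorID
    """
).fetchall()
for row in derived:
    print(row)
```

Because of the inner join, an ID/TimeID pair that lacks either the numerator or the denominator indicator simply produces no derived row.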

This can easily be solved using Oracle's MODEL clause:
select id, timeid, indicatorid, score
from myt
model return updated rows
partition by (id, timeid)
dimension by (indicatorid)
measures(score)
rules(
score[240] = score[45]/score[46],
score[241] = score[47]/score[48]
);
Results:
| ID | TIMEID | INDICATORID | SCORE |
|----|--------|-------------|--------------------|
| 2 | 111 | 241 | 0.8536585365853658 |
| 2 | 111 | 240 | 2.0833333333333335 |
| 1 | 112 | 241 | 0.9120879120879121 |
| 1 | 112 | 240 | 1.4285714285714286 |
| 2 | 112 | 241 | 0.8536585365853658 |
| 2 | 112 | 240 | 2.0833333333333335 |
| 1 | 111 | 241 | 0.9120879120879121 |
| 1 | 111 | 240 | 1.4285714285714286 |
insert into myt
select id, timeid, indicatorid, score
from myt
model return updated rows
partition by (id, timeid)
dimension by (indicatorid)
measures(score)
rules(
score[240] = score[45]/score[46],
score[241] = score[47]/score[48]
);
Then query the table:
select id, timeid, indicatorid, score
from myt
Results:
| ID | TIMEID | INDICATORID | SCORE |
|----|--------|-------------|--------------------|
| 1 | 111 | 45 | 20 |
| 1 | 111 | 46 | 14 |
| 1 | 111 | 47 | 83 |
| 1 | 111 | 48 | 91 |
| 1 | 111 | 240 | 1.4285714285714286 |
| 1 | 111 | 241 | 0.9120879120879121 |
| 1 | 112 | 45 | 20 |
| 1 | 112 | 46 | 14 |
| 1 | 112 | 47 | 83 |
| 1 | 112 | 48 | 91 |
| 1 | 112 | 240 | 1.4285714285714286 |
| 1 | 112 | 241 | 0.9120879120879121 |
| 2 | 111 | 45 | 25 |
| 2 | 111 | 46 | 12 |
| 2 | 111 | 47 | 70 |
| 2 | 111 | 48 | 82 |
| 2 | 111 | 240 | 2.0833333333333335 |
| 2 | 111 | 241 | 0.8536585365853658 |
| 2 | 112 | 45 | 25 |
| 2 | 112 | 46 | 12 |
| 2 | 112 | 47 | 70 |
| 2 | 112 | 48 | 82 |
| 2 | 112 | 240 | 2.0833333333333335 |
| 2 | 112 | 241 | 0.8536585365853658 |

SQL count occurrences of a value

I am trying to count the occurrences of a value in SQL
id | my_id | field_number | field_id | value
------------------------------------------------------------
1 | 101 | 78 | 88 | apple
2 | 287 | 76 | 55 | orange
3 | 893 | 45 | 33 | orange
4 | 922 | 23 | 33 | grape
5 | 198 | 09 | 88 | raisin
6 | 082 | 55 | 88 | apple
If I use the following, it correctly tells me that there are 3 rows with a field_id of 88:
$count = $wpdb->get_results("SELECT COUNT(*) as count FROM wp_db1 WHERE field_id=88");
But if I try and do this:
$count = $wpdb->get_results("SELECT COUNT(*) as count FROM wp_db1 WHERE value=apple");
Then it does not work. Can anyone help?

You missed the quotes around apple. Without them, apple is parsed as a column name rather than a string value:
$count = $wpdb->get_results("SELECT COUNT(*) as count FROM wp_db1 WHERE value='apple'");
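Quoting mistakes like this go away if the value is passed as a bound parameter instead of being written into the SQL text. The sketch below shows the idea with Python's sqlite3 module and an in-memory copy of the question's table; both are assumptions for illustration, and count_value is a made-up helper name. The question itself uses WordPress, whose $wpdb object has its own mechanism for this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_db1 (id INT, my_id TEXT, field_number TEXT, field_id INT, value TEXT)"
)
conn.executemany(
    "INSERT INTO wp_db1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "101", "78", 88, "apple"),
        (2, "287", "76", 55, "orange"),
        (3, "893", "45", 33, "orange"),
        (4, "922", "23", 33, "grape"),
        (5, "198", "09", 88, "raisin"),
        (6, "082", "55", 88, "apple"),
    ],
)


def count_value(value):
    """Count rows whose value column equals the given string (hypothetical helper)."""
    # The ? placeholder lets the driver handle quoting and escaping.
    cur = conn.execute("SELECT COUNT(*) FROM wp_db1 WHERE value = ?", (value,))
    return cur.fetchone()[0]


print(count_value("apple"))
```

The same helper works unchanged for values that contain quotes or other special characters.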

How can I increment counter when the value in another column changes?

I have the following table
ID
12
12
25
25
78
78
78
And I need to be able to increment the counter value when the ID changes.
ID COUNTER
12 1
12 1
25 2
25 2
78 3
78 3
78 3
How can this be done? Is it even possible?

You can use dense_rank():
select id,
dense_rank() over(order by id) Counter
from yourtable
Result:
| ID | COUNTER |
----------------
| 12 | 1 |
| 12 | 1 |
| 25 | 2 |
| 25 | 2 |
| 78 | 3 |
| 78 | 3 |
| 78 | 3 |
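Where window functions are not available, the same numbering can be approximated with a correlated subquery that counts the distinct IDs up to and including the current one. The sketch below is an alternative to dense_rank(), not the answer's own query, and it assumes SQLite via Python's sqlite3 module with the table name yourtable taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?)",
    [(12,), (12,), (25,), (25,), (78,), (78,), (78,)],
)

# For each row, count how many distinct ids are <= this row's id.
# That count is what dense_rank() over (order by id) would return.
result = conn.execute(
    """
    SELECT t.id,
           (SELECT COUNT(DISTINCT t2.id)
            FROM yourtable AS t2
            WHERE t2.id <= t.id) AS counter
    FROM yourtable AS t
    ORDER BY t.id
    """
).fetchall()

for row in result:
    print(row)
```

Both this query and dense_rank() number the IDs by sorted value, so the counter follows ascending ID order rather than the order in which rows were inserted. The subquery runs once per row, so on large tables dense_rank() is likely the better choice where it exists.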
