Renumber duplicates to make them unique - sql

Table "public.t"
Column | Type | Modifiers
--------+---------+-----------
code | text |
grid | integer |
The code column, although of type text, holds a numeric sequence that contains
duplicates. The grid column is a unique sequence.
select * from t order by grid;
code | grid
------+------
1 | 1
1 | 2
1 | 3
2 | 4
2 | 5
2 | 6
3 | 7
The goal is to eliminate the duplicates in the code column to make it unique. The result should be similar to:
code | grid
------+------
1 | 1
6 | 2
4 | 3
2 | 4
7 | 5
5 | 6
3 | 7
The version is 8.2 (no window functions).
create table t (code text, grid integer);
insert into t values
('1',1),
('1',2),
('1',3),
('2',4),
('2',6),
('3',7),
('2',5);

This is the solution that worked.
drop sequence if exists s;
create temporary sequence s;
select setval('s', (select max(cast(code as integer)) m from t));
update t
set code = i
from (
select code, grid, nextval('s') i
from (
select code, max(grid) grid
from t
group by code
having count(*) > 1
order by grid
) q
) s
where
t.code = s.code
and
t.grid = s.grid
The drawback is that the update command must be repeated until no duplicates remain, because each run renumbers only one row per duplicated code. That imperfection is acceptable here, as this is a one-time operation.
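The repetition could probably be avoided by numbering all surplus rows in one pass. Below is a minimal sketch, run through Python's built-in sqlite3 module only so that it needs no server. The helper table renum and the rule that the lowest grid in each group keeps its code are assumptions, not part of the solution above. The correlated COUNT(*) stands in for row_number(), so the same idea should carry over to a PostgreSQL without window functions, although that was not tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (code TEXT, grid INTEGER);
    INSERT INTO t VALUES
        ('1',1),('1',2),('1',3),('2',4),('2',6),('3',7),('2',5);

    -- Assumption: the row with the lowest grid in each group keeps its code.
    -- Every other row is ranked by grid with a correlated COUNT(*) and
    -- receives max(code) + rank as its new code.
    CREATE TEMP TABLE renum AS
    SELECT d.grid,
           (SELECT MAX(CAST(code AS INTEGER)) FROM t)
           + (SELECT COUNT(*)
                FROM t d2
               WHERE d2.grid <= d.grid
                 AND d2.grid NOT IN (SELECT MIN(grid) FROM t GROUP BY code)
             ) AS new_code
      FROM t d
     WHERE d.grid NOT IN (SELECT MIN(grid) FROM t GROUP BY code);

    -- Apply the precomputed mapping in a single UPDATE.
    UPDATE t
       SET code = (SELECT CAST(new_code AS TEXT)
                     FROM renum
                    WHERE renum.grid = t.grid)
     WHERE grid IN (SELECT grid FROM renum);
""")

rows = conn.execute("SELECT code, grid FROM t ORDER BY grid").fetchall()
for code, grid in rows:
    print(code, grid)
```

The mapping is materialised first so that the UPDATE never reads codes it has already changed. The new codes differ from the sample result in the question, which only asked for something "similar".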

Export (and then remove) everything except the code column (you could use a subquery for the export and remove only the duplicated rows). Make code a primary key with auto-increment behaviour and reimport everything. The code column should then be generated automatically.

Related

How to create a column that increments in steps of 4 in Postgresql

I am trying to add a column to my table that increments in steps of four which would look like this:
1
1
1
1
2
2
2
2
3
3
3
3
etc.
I have been reading about CREATE SEQUENCE, but that does not seem to be what I need.
Does anyone have any suggestions how best to do this?
You could use row_number() and integer division:
select
t.*,
(3 + row_number() over(order by id)) / 4 rn
from mytable t
This assumes that you have an ordering column called id. I would not actually recommend storing this derived information. You can compute it on the fly, or put it in a view.
You can still use a regular sequence for the default value, but do the following instead:
CREATE TABLE test (col1 int, col2 text);
CREATE SEQUENCE test_col1_seq OWNED BY test.col1;
ALTER TABLE test ALTER COLUMN col1 SET DEFAULT ceil(nextval('test_col1_seq')/4::numeric);
SELECT * FROM test;
col1 | col2
------+------
1 | a
1 | b
1 | c
1 | d
2 | e
2 | f
2 | g
2 | h
3 | i
(9 rows)
This just divides the sequence value by 4 and then rounds the result up.
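Both formulas above, (3 + n) / 4 with integer division and ceil(n / 4), should give the same step pattern. A small Python check of that arithmetic, where n stands in for the row number or the sequence value (the function names are illustrative only):

```python
import math

def step_by_integer_division(n: int) -> int:
    # Mirrors (3 + row_number()) / 4 with integer division.
    return (n + 3) // 4

def step_by_ceil(n: int) -> int:
    # Mirrors ceil(nextval(...) / 4::numeric).
    return math.ceil(n / 4)

values = [step_by_integer_division(n) for n in range(1, 10)]
print(values)
assert values == [step_by_ceil(n) for n in range(1, 10)]
```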

SQL Insert when not exist Update if exist with multiple rows in Target-Table

I have two tables, where table A has to be updated or have a row inserted, depending on what already exists. I tried JOINs, EXCEPT and the MERGE statement, but there is one problem I can't solve. Here is an example:
Table A (Attribut-Table)
attr | attrValue | prodID
--------------------------
4 | 2 | 1
--------------------------
3 | 10 | 2
--------------------------
1 | 7 | 2
--------------------------
3 | 10 | 3
--------------------------
6 | 9 | 3
--------------------------
1 | 4 | 3
--------------------------
Table P(Product-Table)
prodID | stock |
------------------
1 | 1
------------------
2 | 0
------------------
3 | 1
------------------
4 | 1
------------------
Now I would like to do the following in SQL:
All products that have stock > 0 should have an entry in Table A with attr = 6 and attrValue = 9.
All products that have stock < 1 should have an entry in Table A with attr = 6 and attrValue = 8.
I need a SQL query for this; my problem is that there are multiple entries per prodID in Table A.
This is what I am thinking of:
First check whether any entry for the prodID (from Table P) exists in Table A; if not, INSERT INTO Table A (attr = 6, attrValue = 8 or 9 depending on stock, prodID).
If there is already an entry for the prodID in Table A with attr = 6, then update this row and set attrValue to 8 or 9 (depending on stock).
So I am looking for a translation of "my thoughts" into a SQL query.
Thanks for helping.
(Using SQL Server Express 2012 and HeidiSQL for management.)
Since your "attr 6" row is 100 % derivable from the state of the P table, it is a poor idea to store that row redundantly in A.
This is better:
(1) Define a first view ATTR6_FOR_P as SELECT prodID, 6 as attr, CASE (...) as attrValue from P. The CASE expression chooses the value 8 or 9 according to stock value in P.
(2) Define a second view A_EXT as A UNION ATTR6_FOR_P. (***)
Now changes in stock will always immediately be reflected in A_EXT without having to update explicitly.
(***) but beware of column ordering because SQL UNION does not match columns by name but by ordinal position instead.
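The two views might look roughly like the sketch below. It runs against SQLite through Python's built-in sqlite3 module purely for convenience; the CASE expression and the decision to ignore attr 6 rows already stored in A are assumptions filled in for illustration, and the SQL Server syntax may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P (prodID INTEGER PRIMARY KEY, stock INTEGER);
    CREATE TABLE A (attr INTEGER, attrValue INTEGER, prodID INTEGER);
    INSERT INTO P VALUES (1, 1), (2, 0), (3, 1), (4, 1);
    INSERT INTO A VALUES (4, 2, 1), (3, 10, 2), (1, 7, 2),
                         (3, 10, 3), (6, 9, 3), (1, 4, 3);

    -- (1) The attr 6 row is derived from stock: 9 when stock > 0, else 8.
    CREATE VIEW ATTR6_FOR_P AS
    SELECT 6 AS attr,
           CASE WHEN stock > 0 THEN 9 ELSE 8 END AS attrValue,
           prodID
      FROM P;

    -- (2) Both branches list the columns in the same order, because UNION
    -- matches by position. Assumption: attr 6 rows stored in A are skipped
    -- so that the derived value always wins.
    CREATE VIEW A_EXT AS
    SELECT attr, attrValue, prodID FROM A WHERE attr <> 6
    UNION
    SELECT attr, attrValue, prodID FROM ATTR6_FOR_P;
""")

attr6 = conn.execute(
    "SELECT prodID, attrValue FROM A_EXT WHERE attr = 6 ORDER BY prodID"
).fetchall()
print(attr6)
```

After a change such as UPDATE P SET stock = 0 WHERE prodID = 1, the same query on A_EXT reports 8 for that product without any write to A.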

TSQL SSRS Cross Reference another column

ID | Col2 | Col3 | SequenceNum
--------------------------------
1 | x | 12 | 5
2 | y | 11 | 6
3 | a | 45 | 7
100 | b | 23 | 8
101 | a | 16 | 9
102 | b | 28 | 10
4 | a | 9 | 11
5 | b | 26 | 12
6 | x | 100 | 13
I have an SSRS report at the moment in which you can enter IDs and it will show you the data for those IDs. For example, if you enter start ID 2 and end ID 5, it reports back 2,3,4,5 with the Col2 and Col3 data.
But what I really want is for it to return 2,3,100,101,102,4,5.
I believe there may be some way to cross-reference the SequenceNum column, but I'm fairly new to SQL and SSRS. Can anyone help?
So a user would enter the parameters...
start-ID = 2 which has a SequenceNum of 6
and end-ID = 5 which has a SequenceNum of 12
Look up the starting and ending sequence numbers from the supplied start ID and end ID, and use them in the WHERE condition as below:
DECLARE @StartingSeqNum INT, @EndingSeqNum INT;
-- Translate the user-supplied IDs into their sequence numbers.
SELECT @StartingSeqNum = SequenceNum FROM tableName WHERE ID = @start_id;
SELECT @EndingSeqNum = SequenceNum FROM tableName WHERE ID = @end_id;
SELECT Col2, Col3
FROM tableName
WHERE SequenceNum BETWEEN @StartingSeqNum AND @EndingSeqNum
ORDER BY SequenceNum;
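The lookup can be tried against the sample data with a short sketch. It uses Python's built-in sqlite3 module instead of SQL Server, so the variables become query parameters; the function name ids_between is hypothetical, and ID is assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableName (ID INTEGER, Col2 TEXT, Col3 INTEGER, SequenceNum INTEGER);
    INSERT INTO tableName VALUES
        (1,'x',12,5),(2,'y',11,6),(3,'a',45,7),(100,'b',23,8),(101,'a',16,9),
        (102,'b',28,10),(4,'a',9,11),(5,'b',26,12),(6,'x',100,13);
""")

def ids_between(conn, start_id, end_id):
    # Translate both IDs to their SequenceNum, then filter on that range.
    query = """
        SELECT ID
          FROM tableName
         WHERE SequenceNum BETWEEN
               (SELECT SequenceNum FROM tableName WHERE ID = ?)
           AND (SELECT SequenceNum FROM tableName WHERE ID = ?)
         ORDER BY SequenceNum
    """
    return [row[0] for row in conn.execute(query, (start_id, end_id))]

print(ids_between(conn, 2, 5))
```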
As you are using SSRS you can specify a Value and a Label for your parameters.
Create a dataset with the following SQL as the source:
select distinct ID as Label
,SequenceNum as Value
from YourTable
order by SequenceNum
Then, in the properties for your parameter, under Available Values, select Get values from query and choose the above dataset. Set the Value field and the Label field to the dataset's Value and Label columns and click OK. You will need to do this for both your start and end parameters, using the same dataset.
Your parameters will now be drop-down menus that display the ID value to the user but pass the SequenceNum value to your query. You can then use these to filter your main dataset.

sqlite string replace/delete

I have a column named tags in my database table which contains comma-separated strings. It has records like this:
index | tags
-------------
1 | a,b,c
2 | b
3 | c
4 | z
5 | b,a,c
6 | p,f,w
7 | a,c,b
(For simplicity I am denoting strings with single characters.)
Now I want to replace or delete a particular string.
Delete - say I want to delete b from all rows. If the tags column becomes empty after this operation, that row should be deleted (index 2 in this case). My records should look like this after the operation:
index | tags
-------------
1 | a,c
3 | c
4 | z
5 | a,c
6 | p,f,w
7 | a,c
Replace - say I want to replace every a with k in the original records:
index | tags
-------------
1 | k,b,c
2 | b
3 | c
4 | z
5 | b,k,c
6 | p,f,w
7 | k,c,b
Question - I am thinking of using the replace function somehow, but I am not sure how to meet the above requirements with it. Can I do this in a single SQL command? If not, please suggest the best way to do this (maybe with multiple SQL commands).
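I use MSSQL and am not sure about SQLite, but you can use the REPLACE function like this:
To remove b:
-- Strip b together with its neighbouring comma.
UPDATE Your_Table
SET tags = REPLACE(REPLACE(tags, ',b', ''), 'b,', '');
-- Rows that contained only b are deleted, as the question asks.
DELETE FROM Your_Table
WHERE tags = 'b';
To replace a with k:
UPDATE Your_Table
SET tags = REPLACE(tags, 'a', 'k');
Note that these plain REPLACE calls match substrings, so they are only safe when no tag contains another tag, as in the single-character example.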
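Because the REPLACE calls above match substrings, a tag such as ab would be affected when removing b. One possible workaround, sketched here with Python's built-in sqlite3 module, wraps the list in commas so that only whole tags match. The table name items, the column name idx (used because index is a reserved word) and both helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (idx INTEGER PRIMARY KEY, tags TEXT);
    INSERT INTO items VALUES
        (1,'a,b,c'),(2,'b'),(3,'c'),(4,'z'),(5,'b,a,c'),(6,'p,f,w'),(7,'a,c,b');
""")

def remove_tag(conn, tag):
    # Wrap the list in commas so only whole tags match, then trim them off.
    conn.execute(
        "UPDATE items SET tags = trim(replace(',' || tags || ',', ?, ','), ',')",
        ("," + tag + ",",),
    )
    # Rows whose list became empty are deleted.
    conn.execute("DELETE FROM items WHERE tags = ''")

def rename_tag(conn, old, new):
    # Same wrapping trick, but the whole tag is swapped instead of dropped.
    conn.execute(
        "UPDATE items SET tags = trim(replace(',' || tags || ',', ?, ?), ',')",
        ("," + old + ",", "," + new + ","),
    )

remove_tag(conn, "b")
after_delete = conn.execute("SELECT idx, tags FROM items ORDER BY idx").fetchall()
print(after_delete)
```

A list that holds the same tag twice in a row (for example b,b) would need the update to be run again, since the two matches share a comma.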

Finding the difference between two sets of data from the same table

My data looks like:
run | line | checksum | group
-----------------------------
1 | 3 | 123 | 1
1 | 7 | 123 | 1
1 | 4 | 123 | 2
1 | 5 | 124 | 2
2 | 3 | 123 | 1
2 | 7 | 123 | 1
2 | 4 | 124 | 2
2 | 4 | 124 | 2
and I need a query that returns me the new entries in run 2
run | line | checksum | group
-----------------------------
2 | 4 | 124 | 2
2 | 4 | 124 | 2
I tried several things, but I never got to a satisfying answer.
In this case I'm using H2, but of course I'm interested in a general explanation that would help me to wrap my head around the concept.
EDIT:
OK, it's my first post here so please forgive if I didn't state the question precisely enough.
Basically, given two run values (r1, r2, with r2 > r1), I want to determine which rows having run = r2 have a different line, checksum or group from any row where run = r1.
select * from yourtable
where run = 2 and checksum = (select max(checksum)
from yourtable)
Assuming your last run has a higher run value than the others, the SQL below will help:
select * from table1 t1
where t1.run in
(select max(t2.run) from table1 t2)
Update:
The above SQL may not give you the right rows, because your requirement is not entirely clear. But the overall idea is to fetch the rows based on the latest run value.
SELECT line, checksum, group
FROM TableX
WHERE run = 2
EXCEPT
SELECT line, checksum, group
FROM TableX
WHERE run = 1
or, with a slightly different result (duplicate rows are kept and the run column is included):
SELECT *
FROM TableX x
WHERE run = 2
AND NOT EXISTS
( SELECT *
FROM TableX x2
WHERE run = 1
AND x2.line = x.line
AND x2.checksum = x.checksum
AND x2.group = x.group
)
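The difference between the two queries shows up on the sample data. A minimal sketch using Python's built-in sqlite3 module in place of H2, with the group column double-quoted because it is a reserved word:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableX (run INTEGER, line INTEGER, checksum INTEGER, "group" INTEGER);
    INSERT INTO TableX VALUES
        (1,3,123,1),(1,7,123,1),(1,4,123,2),(1,5,124,2),
        (2,3,123,1),(2,7,123,1),(2,4,124,2),(2,4,124,2);
""")

# EXCEPT works on sets, so the duplicated run 2 row is returned once.
except_rows = conn.execute("""
    SELECT line, checksum, "group" FROM TableX WHERE run = 2
    EXCEPT
    SELECT line, checksum, "group" FROM TableX WHERE run = 1
""").fetchall()

# NOT EXISTS filters row by row, so both copies survive.
not_exists_rows = conn.execute("""
    SELECT * FROM TableX x
     WHERE x.run = 2
       AND NOT EXISTS (SELECT * FROM TableX x2
                        WHERE x2.run = 1
                          AND x2.line = x.line
                          AND x2.checksum = x.checksum
                          AND x2."group" = x."group")
""").fetchall()

print(except_rows)
print(not_exists_rows)
```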
A slightly different approach:
select min(run) run, line, checksum, group
from mytable
where run in (1,2)
group by line, checksum, group
having min(run) = 2
Incidentally, I assume that the "group" column in your table isn't actually called group - this is a reserved word in SQL and would need to be enclosed in double quotes (or backticks or square brackets, depending on which RDBMS you are using).