Reading 2 rows together from a SQL Server table

I've got a table which has the following data:
Record Type   MID          CvvId           Date         Amount
-----------   ----------   -------------   ----------   ------
2E            8715613516                   2014-25-03   27.12
3E                         asd5156154485   M
2E            8751651650                   2014-25-03   27.13
3E                         asd5165434485   S
Now I have to read the values from this table and insert them into another table with the columns:
MID, CvvId, Date, Amount.
For every record of type "2E", the row immediately after it is always of type "3E". For the MID, Date, and Amount columns I have to use the "2E" record; for CvvId, I have to use the "3E" record that follows it. I have to write a stored procedure in SQL Server to insert this data into another table. How can this be achieved?

I changed your table by adding a TId (transaction ID) column, which has the same value for a 2E record and its matching 3E record, as follows:
create table tableE(
TId int,
rtype varchar(20),
MID varchar(20),
CvvId varchar(20),
date date,
amount decimal(18,2))
Then the following SELECT statement will help you combine them into a single row:
select *
from tableE t1
inner join tableE t2 on t1.tid = t2.tid
where t1.rtype = '2E' and t2.rtype = '3E'
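To actually load the target table, the same join can feed an INSERT ... SELECT. A minimal sketch, with the destination table name and column list assumed from the question:
-- Assumption: the destination table is named targetTable and has these columns.
INSERT INTO targetTable (MID, CvvId, [Date], Amount)
SELECT t1.MID, t2.CvvId, t1.[date], t1.amount
FROM tableE t1
INNER JOIN tableE t2 ON t1.TId = t2.TId
WHERE t1.rtype = '2E' AND t2.rtype = '3E';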

Here is a process for doing this using bulk insert. As suggested in another answer, you want to change the definition of the underlying table to have an identity column:
create table atable (
atableId int not null identity(1, 1) primary key,
rtype varchar(20),
MID varchar(20),
CvvId varchar(20),
date date,
amount decimal(18,2)
);
Then, define a view on this table that excludes the identity:
create view v_atable as
select rtype, mid, cvvid, date, amount
from atable;
Now, use bulk insert into the view. This will generate an auto-incrementing identity value for each row. You then have an ordering and can identify the next row.
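For example, a minimal sketch of that bulk insert, assuming a comma-delimited source file (the file path and format options are placeholders):
-- Assumption: file location and delimiters are illustrative only.
BULK INSERT v_atable
FROM 'C:\data\records.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');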

I got it to work. As recommended above, I added a new auto-increment field (not necessarily a primary key) named Tid and wrote the following query:
SELECT p.MID,
       LEAD(p.CvvId) OVER (ORDER BY p.Tid) AS NextCvv,
       LEAD(p.RecordType) OVER (ORDER BY p.Tid) AS NextRecordType
FROM [table] p
After filtering the record set on some business conditions, it worked. Thanks all for the help!
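Putting it together, the LEAD query can drive the insert once the 2E/3E filter is applied. A sketch, with the source table bracketed as [table] as above and the destination name assumed:
-- Assumptions: [table] is the staged source; targetTable is the destination from the question.
INSERT INTO targetTable (MID, CvvId, [Date], Amount)
SELECT MID, NextCvv, [Date], Amount
FROM (
    SELECT p.RecordType, p.MID, p.[Date], p.Amount,
           LEAD(p.CvvId) OVER (ORDER BY p.Tid) AS NextCvv,
           LEAD(p.RecordType) OVER (ORDER BY p.Tid) AS NextRecordType
    FROM [table] p
) q
WHERE RecordType = '2E' AND NextRecordType = '3E';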

Related

SQL stored procedure to determine age and change the value of a row in another table depending upon the age

This is a continuation of the previous question here.
So I have a table named GenericAttribute which has some values like this:
Id   KeyGroup   Key                           Value
28   Customer   DateOfBirth                   26-01-2000
29   Customer   DateOfBirth                   26-01-2020
30   Customer   CountryPage.HideStatesBlock   FALSE
I have another table named RoleMapper that maps a customer, based on their ID, to their role ID. The Id in GenericAttribute is a foreign key originating from the CustomerID column of the RoleMapper table, below.
CustomerID   CustomerRoleId
28           58
29           27
My intention is to create a SQL Agent job with a stored procedure that updates the RoleMapper CustomerRoleId value to 24 if a customer's age is more than 60 as of today. The job must run once a day.
I am using SQL Server.
I tried using this query based on the answer given in my previous question.
select [id] from [Genericattribute]
where [key] = 'DateOfBirth'
and right(value,5)=format(getdate(),'MM-dd')
Though I was able to work out whose birthday was today, I was unable to proceed when more than one person had a birthday on the same day, even after using a table data type.
Try scheduling a SQL Server Agent job with the query below:
DECLARE @Today date = GETDATE()
;WITH CTE
AS
(
    SELECT *, RIGHT(value,4)+'-'+SUBSTRING(value,4,2)+'-'+LEFT(value,2) AS [Date]
    FROM GenericAttribute
    WHERE [key] = 'DateOfBirth'
)
UPDATE RoleMapper
SET CustomerRoleId = 24
FROM RoleMapper RoleMapper
JOIN CTE
    ON CTE.ID = RoleMapper.CustomerID
WHERE DATEDIFF(YEAR,[Date],@Today) > 60
   OR (DATEDIFF(YEAR,[Date],@Today) = 60 AND MONTH(@Today) >= MONTH([Date]) AND DAY(@Today) >= DAY([Date]))
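Note that DATEDIFF(YEAR, ...) counts calendar-year boundaries, and the month/day comparison above can misfire around month ends. An alternative worth considering (my addition, not part of the original answer) is to compare against the 60th birthday directly:
-- Sketch: true once the customer's 60th birthday has passed.
WHERE DATEADD(YEAR, 60, CONVERT(date, [Date])) <= @Today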

inserting results of different queries of different tables into a newly created table in Postgres via PGadmin4

So I have created a table with multiple columns to collect some information about a database:
CREATE TABLE DATENBANKEN (
ID serial,
name VARCHAR(20),
Erstellt timestamp,
Datenbankgröße VARCHAR(20),
"Collation" VARCHAR (20),
Beschreibung VARCHAR (50)
)
and with the following INSERT statement I was able to fill the rows with the desired information:
INSERT INTO DATENBANKEN (id, name, Erstellt, Datenbankgröße, "Collation")
SELECT pg_database.oid,
pg_database.datname,
(pg_stat_file('base/'||pg_database.oid ||'/PG_VERSION')).modification,
pg_size_pretty(pg_database_size(datname)),
pg_database.datcollate datcollate from pg_database
The insert worked and gave the expected results. All the values above were captured from one table (pg_database). The last value, "Beschreibung", is located in another table named pg_shdescription, so in this case I had to make another INSERT statement specifically for that column:
INSERT INTO DATENBANKEN (Beschreibung)
select pg_shdescription.description from pg_shdescription
As you can see, the rows in the column "Beschreibung" were not inserted beside the first three rows as I expected, but were added as additional rows with no connection to the data above.
This is the table pg_shdescription; as you can see, for every objoid there is a specific description, so objoid 1 is "default template for new databases". The 4th row's "Beschreibung" value should therefore have gone into the second row, where the database name "template1" is.
What did I do wrong here, and what is the best way to insert certain data from different tables into a new table so that the rows are still linked together?
I really appreciate any help 🙂
I tried an INNER JOIN in the statement, but it did not work:
INSERT INTO DATENBANKEN (Beschreibung)
select pg_shdescription.description from pg_shdescription
INNER JOIN on Datenbanken.id = pg_shdescription.objoid
You should join the tables before you insert; otherwise you would need to UPDATE DATENBANKEN and not INSERT INTO:
INSERT INTO DATENBANKEN (id, name, Erstellt, Datenbankgröße, "Collation",Beschreibung )
SELECT pg_database.oid,
pg_database.datname,
(pg_stat_file('base/'||pg_database.oid ||'/PG_VERSION')).modification,
pg_size_pretty(pg_database_size(datname)),
pg_database.datcollate datcollate,
pg_shdescription.description
from pg_database JOIN pg_shdescription
ON pg_database.oid = pg_shdescription.objoid
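One caveat worth noting (my addition): an inner join drops databases that have no row in pg_shdescription. A LEFT JOIN keeps them, with a NULL Beschreibung:
-- Sketch: keep databases that have no description row.
from pg_database LEFT JOIN pg_shdescription
ON pg_database.oid = pg_shdescription.objoid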

Counting points on a linestring

I am trying to count the number of points on a line for each row in the following table:
CREATE TABLE outils.prod(
pk INTEGER PRIMARY KEY,
cable VARCHAR (25),
PA VARCHAR (10),
Art VARCHAR(7),
FT Numeric,
BT Numeric
);
INSERT INTO outils.prod (pk)
SELECT id_ftth
FROM outils.cable
WHERE type_cable = '2' ;
SELECT ADDGEOMETRYCOLUMN('outils','prod','geom',2154,'MultiLineString',2);
I have tried to update my line table, but I have trouble getting an answer for each row.
UPDATE outils.prod SET FT=(SELECT COUNT( ST_INTERSECTION(outils.prod.geom,outils.ft.geom))
FROM outils.prod , outils.ft)
With the above code I managed to get the total number of intersections for every line, but I would like to have the count per line in my line table.
Thank you,
Hugo
You would have to write a sub-query to do the count per line.
Also you don't need to compute the intersection (the geom), but just to check if they intersect, which is much faster.
UPDATE outils.prod
SET FT = sub.cnt
FROM (
    SELECT prod.pk, count(*) AS cnt
    FROM outils.ft
    JOIN outils.prod ON ST_INTERSECTS(prod.geom, ft.geom)
    GROUP BY prod.pk
) AS sub
WHERE prod.pk = sub.pk;
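Since ST_Intersects can use a spatial index, adding GiST indexes on both geometry columns speeds this up considerably (assuming, as a guess, that none exist yet):
-- Sketch: spatial indexes let ST_Intersects avoid comparing every pair of rows.
CREATE INDEX IF NOT EXISTS prod_geom_idx ON outils.prod USING GIST (geom);
CREATE INDEX IF NOT EXISTS ft_geom_idx ON outils.ft USING GIST (geom);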

Split one large, denormalized table into a normalized database

I have a large (5 million row, 300+ column) csv file I need to import into a staging table in SQL Server, then run a script to split each row up and insert data into the relevant tables in a normalized db. The format of the source table looks something like this:
(fName, lName, licenseNumber1, licenseIssuer1, licenseNumber2, licenseIssuer2..., specialtyName1, specialtyState1, specialtyName2, specialtyState2..., identifier1, identifier2...)
There are 50 licenseNumber/licenseIssuer columns, 15 specialtyName/specialtyState columns, and 15 identifier columns. There is always at least one of each of those, but the remaining 49 or 14 could be null. The first identifier is unique, but is not used as the primary key of the Person in our schema.
My database schema looks like this
People(ID int Identity(1,1))
Names(ID int, personID int, lName varchar, fName varchar)
Licenses(ID int, personID int, number varchar, issuer varchar)
Specialties(ID int, personID int, name varchar, state varchar)
Identifiers(ID int, personID int, value varchar)
The database will already be populated with some People before adding the new ones from the csv.
What is the best way to approach this?
I have tried iterating over the staging table one row at a time with select top 1:
DECLARE @LastInsertedID int
WHILE EXISTS (SELECT TOP 1 * FROM staging)
BEGIN
    INSERT INTO People DEFAULT VALUES
    SET @LastInsertedID = SCOPE_IDENTITY() -- might use the output clause to get this instead
    INSERT INTO Names (personID, lName, fName)
    SELECT TOP 1 @LastInsertedID, lName, fName FROM staging
    INSERT INTO Licenses (personID, number, issuer)
    SELECT TOP 1 @LastInsertedID, licenseNumber1, licenseIssuer1 FROM staging
    IF (SELECT TOP 1 licenseNumber2 FROM staging) IS NOT NULL
    BEGIN
        INSERT INTO Licenses (personID, number, issuer)
        SELECT TOP 1 @LastInsertedID, licenseNumber2, licenseIssuer2 FROM staging
    END
    -- Repeat the above 49 times, etc...
    DELETE TOP (1) FROM staging
END
One problem with this approach is that it is prohibitively slow, so I refactored it to use a cursor. This works and is significantly faster, but has me declaring 300+ variables for the FETCH INTO.
Is there a set-based approach that would work here? That would be preferable, as I understand that cursors are frowned upon, but I'm not sure how to get the identity from the INSERT into the People table for use as a foreign key in the others without going row-by-row from the staging table.
Also, how could I avoid copy and pasting the insert into the Licenses table? With a cursor approach I could try:
FETCH INTO ...@LicenseNumber1, @LicenseIssuer1, @LicenseNumber2, @LicenseIssuer2...
INSERT INTO #LicenseTemp (number, issuer) VALUES
(@LicenseNumber1, @LicenseIssuer1),
(@LicenseNumber2, @LicenseIssuer2),
... Repeat 48 more times...
INSERT INTO Licenses (personID, number, issuer)
SELECT @LastInsertedID, number, issuer
FROM #LicenseTemp
WHERE number IS NOT NULL
There still seems to be some redundant copy and pasting there, though.
To summarize the questions, I'm looking for idiomatic approaches to:
Break up one large staging table into a set of normalized tables, retrieving the Primary Key/identity from one table and using it as the foreign key in the others
Insert multiple rows into the normalized tables that come from many repeated columns in the staging table with less boilerplate/copy and paste (Licenses and Specialties above)
Short of discrete answers, I'd also be very happy with pointers towards resources and references that could assist me in figuring this out.
Ok, I'm not an SQL Server expert, but here's the "strategy" I would suggest.
Calculate the personId on the staging table
As @Shnugo suggested before me, calculating the personId in the staging table will ease the next steps
Use a sequence for the personID
From SQL Server 2012 you can define sequences. If you use one for every person insert, you'll never risk overlapping IDs. If (as it seems) you have personIds that were loaded before the sequence existed, you can create the sequence with the first free personID as its starting value.
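A minimal sketch (the starting value is a placeholder for the first free personID):
-- Assumption: 1001 stands in for the first free personID.
CREATE SEQUENCE PersonSeq AS int START WITH 1001 INCREMENT BY 1;
-- Each insert can then draw an ID with: NEXT VALUE FOR PersonSeq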
Create a numbers table
Create a utility table keeping numbers from 1 to n (you need n to be at least 50; you can look at this question for some implementations).
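One possible implementation, as a sketch:
-- A minimal numbers table holding 1..50, seeded from a system catalog view.
CREATE TABLE numbers (n int PRIMARY KEY);
INSERT INTO numbers (n)
SELECT TOP (50) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;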
Use set logic to do the insert
I'd avoid cursors and row-by-row logic: you are right that it is better to limit the number of accesses to the table, but I'd say that you should strive to limit it to one access per target table.
You could proceed like this:
People:
INSERT INTO People (personID)
SELECT personId from staging;
Names:
INSERT INTO Names (personID, lName, fName)
SELECT personId, lName, fName from staging;
Licenses:
Here we'll need the numbers table:
INSERT INTO Licenses (personId, number, issuer)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then licenseNumber1
when 2 then licenseNumber2
...
when 50 then licenseNumber50
end as licenseNumber,
case nbrs.n
when 1 then licenseIssuer1
when 2 then licenseIssuer2
...
when 50 then licenseIssuer50
end as licenseIssuer
from staging
cross join
(select n from numbers where n>=1 and n<=50) nbrs
) t WHERE licenseNumber is not null;
Specialties:
INSERT INTO Specialties(personId, name, state)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then specialtyName1
when 2 then specialtyName2
...
when 15 then specialtyName15
end as specialtyName,
case nbrs.n
when 1 then specialtyState1
when 2 then specialtyState2
...
when 15 then specialtyState15
end as specialtyState
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) t WHERE specialtyName is not null;
Identifiers:
INSERT INTO Identifiers(personId, value)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then identifier1
when 2 then identifier2
...
when 15 then identifier15
end as value
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) t WHERE value is not null;
Hope it helps.
You say: but the staging table could be modified
I would
add a PersonID INT NOT NULL column and fill it with DENSE_RANK() OVER(ORDER BY fname,lname)
add an index to this PersonID
use this ID in combination with GROUP BY to fill your People table
do the same with your names table
And then use this ID for a set-based insert into your three side tables
Do it like this
SELECT AllTogether.PersonID, AllTogether.TheValue
FROM
(
SELECT PersonID,SomeValue1 AS TheValue FROM StagingTable
UNION ALL SELECT PersonID,SomeValue2 FROM StagingTable
UNION ALL ...
) AS AllTogether
WHERE AllTogether.TheValue IS NOT NULL
UPDATE
You say: might cause a conflict with IDs that already exist in the People table
You did not tell anything about existing People...
Is there any sure and unique mark to identify them? Use a simple
UPDATE StagingTable SET PersonID=xyz WHERE ...
to set existing PersonIDs into your staging table and then use something like
UPDATE StagingTable
SET PersonID=DENSE_RANK() OVER(...) + MaxExistingID
WHERE PersonID IS NULL
to set new IDs for PersonIDs still being NULL.
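One caveat (my addition, not part of the original answer): T-SQL does not allow a window function directly in an UPDATE's SET clause, so the DENSE_RANK fill is usually routed through an updatable CTE. A sketch, with the ORDER BY columns and the existing-ID lookup assumed:
-- Assumptions: People.ID holds the existing IDs; fName/lName identify a person.
DECLARE @MaxExistingID int = (SELECT ISNULL(MAX(ID), 0) FROM People);
WITH ranked AS (
    SELECT PersonID,
           DENSE_RANK() OVER (ORDER BY fName, lName) AS rnk
    FROM StagingTable
    WHERE PersonID IS NULL
)
UPDATE ranked SET PersonID = rnk + @MaxExistingID;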

Unique constraint on Distinct select in Oracle database

I have a data processor that would create a table from a select query.
<_config:table definition="CREATE TABLE TEMP_TABLE (PRODUCT_ID NUMBER NOT NULL, STORE NUMBER NOT NULL, USD NUMBER(20, 5),
CAD NUMBER(20, 5), Description varchar(5), ITEM_ID VARCHAR(256), PRIMARY KEY (ITEM_ID))" name="TEMP_TABLE"/>
and the select query is
<_config:query sql="SELECT DISTINCT ce.PRODUCT_ID, ce.STORE, op.USD ,op.CAD, o.Description, ce.ITEM_ID
FROM PRICE op, PRODUCT ce, STORE ex, OFFER o, SALE t
where op.ITEM_ID = ce.ITEM_ID and ce.STORE = ex.STORE
and ce.PRODUCT_ID = o.PRODUCT_ID and o.SALE_ID IN (2345,1234,3456) and t.MEMBER = ce.MEMBER"/>
When I run that processor, I get a unique constraint error, though I have a DISTINCT in my SELECT statement.
I tried with CREATE TABLE AS (SELECT .....) and it creates fine.
How is it possible to get that error? I'm doing a batch execute, so I'm not able to find the individual record.
The select distinct applies to the entire row, not to each column individually. So, two rows could have the same value of item_id but be different in the other columns.
The ultimate fix might be to have a group by item_id in the query, instead of select distinct. That would require other changes to the logic. Another possibility would be to use row_number() in a subquery and select the first row.
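A sketch of the row_number() variant: keep one row per ITEM_ID, with an arbitrary tie-breaker (the ORDER BY column here is an assumption):
-- Assumption: PRODUCT_ID is used only as a tie-breaker; any deterministic column works.
SELECT PRODUCT_ID, STORE, USD, CAD, Description, ITEM_ID
FROM (
    SELECT ce.PRODUCT_ID, ce.STORE, op.USD, op.CAD, o.Description, ce.ITEM_ID,
           ROW_NUMBER() OVER (PARTITION BY ce.ITEM_ID ORDER BY ce.PRODUCT_ID) AS rn
    FROM PRICE op, PRODUCT ce, STORE ex, OFFER o, SALE t
    WHERE op.ITEM_ID = ce.ITEM_ID AND ce.STORE = ex.STORE
      AND ce.PRODUCT_ID = o.PRODUCT_ID AND o.SALE_ID IN (2345,1234,3456) AND t.MEMBER = ce.MEMBER
)
WHERE rn = 1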