I have a "flat file" with structure as below:
machineCode,Key,Ip_Name_No,Share_Percent,Account_Name,Account_No
"ygh048GT",4767,534293748,"100.00","cderfgdsc Publishing International Ltd","160102040"
"xcd064HW",6380,65424090,"100.00","dascdfrgh snm skion","00090382478"
"000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478"
.....
The first row contains the column headings. As can be seen, the fields are separated by commas.
The requirement is to split each single string into separate fields.
This could be done in Excel, using the Text to Columns option with a comma as the delimiter, and then uploaded to a DB table, but the Account_Name field can itself contain commas.
So I came up with the SQL below. Does it look correct? Also, there must be an easier way to do this; any suggestions?
WITH POS AS (
select
LOCATE_IN_STRING ( DATA , ',' , 2 ) - 1 AS TUNECODE_END ,
LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) - 1 AS WORKKEY_END,
LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) - 1 AS IPNN_END,
LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) - 1 AS PERC_END,
CASE WHEN
SUBSTR ( DATA ,
(
LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) + 1 ) ) + 1),
1) <> '"'
THEN
LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) + 1 ) ) + 1)) - 1
ELSE
LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) + 1 ) ) - 1
END AS ACNAME_END,
RRN(P) ROWN
FROM PLDWRK P
) SELECT
CAST ( SUBSTR( DATA , 1, TUNECODE_END ) AS CHAR(25))AS MACHINECODE ,
CAST ( SUBSTR( DATA , TUNECODE_END + 2 , WORKKEY_END - (TUNECODE_END + 1) ) AS DEC(12,0)) AS KEY,
CAST(SUBSTR( DATA , WORKKEY_END + 2, IPNN_END - (WORKKEY_END + 1) ) AS DEC(12, 0 )) AS IP_NN,
CAST (SUBSTR( DATA, IPNN_END + 2, PERC_END - (IPNN_END + 1)) AS CHAR(8))AS PERCENTAGE,
CAST (SUBSTR( DATA, PERC_END + 2, ACNAME_END - (PERC_END + 1)) AS CHAR(100)) AS ACCOUNT_NAME,
CAST (SUBSTR( DATA, ACNAME_END + 2 ) AS CHAR(30)) as ACCOUNT_NUMBER
FROM PLDWRK P JOIN POS ON ROWN = RRN(P)
A plain regexp_substr split on commas doesn't work properly for the last row, since one token contains a comma inside its quotes:
select
t.str
, regexp_substr (t.str, '[^,]+', 1, 1) as tok1
, regexp_substr (t.str, '[^,]+', 1, 2) as tok2
---, ...
, regexp_substr (t.str, '[^,]+', 1, 6) as tok6
from
(
values
(1, 'machineCode,Key,Ip_Name_No,Share_Percent,Account_Name,Account_No')
, (2, '"ygh048GT",4767,534293748,"100.00","cderfgdsc Publishing International Ltd","160102040"')
, (3, '"xcd064HW",6380,65424090,"100.00","dascdfrgh snm skion","00090382478"')
, (4, '"000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478"')
) t (id, str)
order by t.id
| STR | TOK1 | TOK2 | TOK6 |
| machineCode,Key,Ip_Name_No,Share_Percent,Account_Name,Account_No | machineCode | Key | Account_No |
| "ygh048GT",4767,534293748,"100.00","cderfgdsc Publishing International Ltd","160102040" | "ygh048GT" | 4767 | "160102040" |
| "xcd064HW",6380,65424090,"100.00","dascdfrgh snm skion","00090382478" | "xcd064HW" | 6380 | "00090382478" |
| "000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478" | "000065AN" | 6402 | john sean" |
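To make the failure on the last row concrete, here is a minimal Python sketch (Python is my choice for illustration, not part of the original answer). It compares the same `[^,]+` pattern with a quote-aware alternative; the second pattern is an assumption of mine and only holds if there are no empty fields and no escaped quotes inside quoted values.

```python
import re

line = '"000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478"'

# Naive split on commas: the quoted name is cut in two, giving 7 tokens.
naive_tokens = re.findall(r'[^,]+', line)

# Quote-aware pattern: try a whole "..." field first, else a run of non-commas.
# Assumption: no empty fields and no escaped quotes inside quoted values.
quoted_tokens = re.findall(r'"[^"]*"|[^,]+', line)

print(len(naive_tokens), naive_tokens[5])
print(len(quoted_tokens), quoted_tokens[4])
```

The naive pattern returns the tail of the account name as the sixth token, which is the same symptom as in the result above.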
Lots of things are easier than raw SQL...
Why not simply use Copy From Import File (CPYFRMIMPF)? That's what it's designed for.
CPYFRMIMPF FROMSTMF('/inbound/somedata.csv') TOFILE(MYLIB/MYTABLE) MBROPT(*REPLACE) RCDDLM(*CRLF) DTAFMT(*DLM) STRDLM(*DBLQUOTE) RMVCOLNAM(*YES)
You'll have to transfer the stream data into the IFS (where it really belongs) instead of a DB table.
IBM's Access Client Solutions (ACS) includes data transfer functionality that can understand .CSV files. This can be automated and can in fact run on either a PC or the IBM i itself.
Another great option would be an RPG program. Back in 2008, Scott Klement wrote a CSV parser in RPG; he has since enhanced it to make it easier to use by taking advantage of RPG's DATA-INTO opcode.
Lastly, it's 2023: Node.js, PHP, and Python are all available on IBM i, and all of them have libraries/packages to handle CSV and write to a DB table.
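As a rough illustration of the Python route, the standard-library `csv` module already understands double-quoted fields with embedded commas. The sketch below parses the sample rows from the question held in a string; reading the real file and writing to the DB table are left out, since they depend on the driver and environment you use.

```python
import csv
import io

# Sample data copied from the question; in practice this would be a file.
raw = '''machineCode,Key,Ip_Name_No,Share_Percent,Account_Name,Account_No
"ygh048GT",4767,534293748,"100.00","cderfgdsc Publishing International Ltd","160102040"
"xcd064HW",6380,65424090,"100.00","dascdfrgh snm skion","00090382478"
"000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478"
'''

# DictReader uses the first row as column names and honours the quotes.
rows = list(csv.DictReader(io.StringIO(raw)))

for row in rows:
    print(row["machineCode"], "|", row["Account_Name"], "|", row["Account_No"])
```

The comma inside "xcdertn,john sean" stays within the Account_Name field, so no position arithmetic is needed.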
I have a STUFF function that concatenates multiple records, putting a line break after every second record, and it works fine with this query:
STUFF((
SELECT CASE WHEN ROW_NUMBER() OVER (order by new_name) % 2 = 1 THEN CHAR(10) ELSE ',' END + new_name
FROM new_subcatagories
FOR XML PATH('')), 1, 1, '')
and the result is
Auditory,Kinesthetic vestibular
Multitasking,Planning & organization
Proprioception,Tactile
Vestibular tactile,Visual
But now I want to do the same with another column on which I need DISTINCT, and I can't get it to work. My query is:
STUFF((
SELECT distinct (CASE WHEN ROW_NUMBER() OVER (order by new_maincatgoriesname) % 2 = 1 THEN CHAR(10) ELSE ',' END
+ new_maincatgoriesname)
FOR XML PATH('')), 1, 1, '')
and the result comes out in various unexpected ways, for example
Executive Function
Sensory Discrimination
Sensory modulation ,Multitasking,Sensory Discrimination,Sensory modulation
or in other unexpected ways. I want the result to be
Executive Function,Sensory Discrimination
Sensory modulation,Multitasking
If someone can help me, it would be really appreciated.
DISTINCT applies to the entire row, and ROW_NUMBER() is evaluated before DISTINCT, so the prefix derived from it makes otherwise identical values differ and the duplicates survive.
To fix it, you need to add another query nesting level that removes the duplicates first.
DECLARE @Blah TABLE( new_maincatgoriesname VARCHAR( 200 ))
INSERT INTO @Blah
VALUES( 'Executive Function' ), ( 'Sensory Discrimination' ), ( 'Multitasking' ),
( 'Sensory Discrimination' ), ( 'Executive Function' ), ( 'Sensory modulation' )
SELECT
STUFF( CAST((
-- Step 2: manipulate result of Step 1
SELECT (CASE WHEN ROW_NUMBER() OVER (order by new_maincatgoriesname) % 2 = 1 THEN CHAR(10) ELSE ',' END + new_maincatgoriesname )
FROM
-- Step 1: Get distinct values
( SELECT DISTINCT new_maincatgoriesname
FROM @Blah ) AS MainQuery
FOR XML PATH('') ) AS VARCHAR( 2000 )), 1, 1, '' )
Output:
Executive Function,Multitasking
Sensory Discrimination,Sensory modulation
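The order of operations (de-duplicate first, then number and pair up) can be sketched outside SQL as well. The Python below is only an illustration of that two-step idea using the same sample values; it is not a replacement for the query.

```python
names = ["Executive Function", "Sensory Discrimination", "Multitasking",
         "Sensory Discrimination", "Executive Function", "Sensory modulation"]

# Step 1: get distinct values (sorted, like ORDER BY in the query).
distinct_sorted = sorted(set(names))

# Step 2: only now group them two per line.
pairs = [distinct_sorted[i:i + 2] for i in range(0, len(distinct_sorted), 2)]
result = "\n".join(",".join(pair) for pair in pairs)

print(result)
```

If the pairing were done before removing duplicates, each copy of a name would get its own position and separator, which is what happens when DISTINCT and ROW_NUMBER() sit in the same SELECT.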
I have a column with data as given below -
I want to remove any [space] before and after the first instance of the '-' character in the data so that I can get the following cleansed data -
How to write this as a SQL Query ?
Try this one
CREATE TABLE Spaces(
Value VARCHAR(45)
);
INSERT INTO Spaces VALUES
('B2555 - 30...'),
('Babc30 - 40 ...'),
('B5- 50..'),
('B6AfG066ML -60..');
SELECT CASE WHEN CHARINDEX(' -', Value) > 0 THEN
STUFF(Value, CHARINDEX(' -', Value), 1, '')
ELSE
Value
End Result
FROM
(
SELECT CASE WHEN CHARINDEX('- ', Value) > 0 THEN
STUFF(Value, CHARINDEX('- ', Value) + 1, 1, '')
ELSE
Value
End Value
FROM
(
SELECT CASE WHEN CHARINDEX(' - ', Value) > 0 THEN
STUFF(Value, CHARINDEX(' - ', Value), 1, '')
ELSE
Value
End Value
FROM Spaces
) T1
) T2;
Returns:
+------------------------+
| Result |
+------------------------+
| B2555-30- ABC - ABC... |
| Babc30-40 ... |
| B5-50.. |
| B6AfG066ML-60.. |
+------------------------+
Here's another option for you.
This assumes the following:
Only a leading or trailing space around the first instance of '-' is removed; all others are preserved.
It accounts for one and only one leading or trailing space.
The data could already be "cleaned".
Give this a try:
DECLARE @TestData TABLE
(
[StringData] NVARCHAR(100)
);
INSERT INTO @TestData (
[StringData]
)
VALUES ( 'ADFADSF- ASDFSADF - Q343243498' )
, ( 'ABC - EFSSADF - 2345234532' )
, ( 'EFGSADFSA -ASDFSADF - 2342345234' )
, ( 'ASDF34 - ASDLFASDJF - 234234 - 34324' )
, ( 'ABC-123 - 465 - 685' );
SELECT *
, STUFF([StringData]
, CHARINDEX('-', [StringData]) - 1
, 3
, REPLACE(SUBSTRING([StringData], CHARINDEX('-', [StringData]) - 1, 3), ' ', '')
) AS [CleanStringData]
FROM @TestData;
Basically, this takes the three-character window running from one character before the first '-' to one character after it, and replaces it with those same characters with any spaces removed.
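The same rule (at most one space on each side of the first '-', everything else untouched) can be sketched in Python with a single regular-expression substitution. The function name `clean_first_hyphen` is hypothetical, and this is an illustration of the rule rather than a translation of the T-SQL.

```python
import re

def clean_first_hyphen(value: str) -> str:
    # " ?- ?" matches the hyphen plus an optional single space on each side;
    # count=1 limits the substitution to the first (leftmost) hyphen.
    return re.sub(r" ?- ?", "-", value, count=1)

samples = [
    "ADFADSF- ASDFSADF - Q343243498",
    "ABC - EFSSADF - 2345234532",
    "EFGSADFSA -ASDFSADF - 2342345234",
    "ABC-123 - 465 - 685",
]
for s in samples:
    print(clean_first_hyphen(s))
```

Already clean values such as 'ABC-123 - 465 - 685' pass through unchanged, because the first hyphen has no surrounding spaces.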
I have to work out the next number from a given set of numbers. My table contains numbers like those below. The main product always ends in .1 and may or may not have subproducts, e.g.:
07.0001.1 (main product)
07.0001.2 (its subproduct)
07.0001.3 (its subproduct)
etc..
01.1453.1
01.1453.2
03.3456.1
03.3456.2
03.3456.3
03.5436.1
03.5436.2
03.5436.3
03.5436.4
12.7839.1
12.7839.2
12.3232.1
12.4444.1
12.4444.2
13.7676.1
I want to pass the first two digits of a number to the query, get all numbers that start with them, find the highest value among the next four digits, and return that number + 1.
Using the example above, if I pass 12, it should find the product 12.7839.x and return 12.7839 + 1, i.e. 12.7840.
Another example: if I pass 03, it should find 03.5436 and return 03.5437.
I hope you know what I mean.
I am not very familiar with SQL, but this is how far I have got:
select * from tbArtikel where Nummer LIKE '12.%'
This is another alternative for achieving the desired result, with the option to pass in the number to be queried. Consider the following SQL statements:
CREATE TABLE tblDummyExample
(
Number VARCHAR(64)
)
INSERT INTO tblDummyExample
VALUES ('07.0001.1')
, ('07.0001.2')
, ('07.0001.3')
, ('01.1453.1')
, ('01.1453.2')
, ('03.3456.1')
, ('03.3456.2')
, ('03.3456.3')
, ('03.5436.1')
, ('03.5436.2')
, ('03.5436.3')
, ('03.5436.4')
, ('12.7839.1')
, ('12.7839.2')
, ('12.3232.1')
, ('12.4444.1')
, ('12.4444.2')
, ('13.7676.1')
DECLARE @startWith VARCHAR(2) = '12' -- provide any number as input
SELECT @startWith + '.'+ CAST((MAX(CAST(SUBSTRING(ex.Number, (CHARINDEX('.', ex.Number, 1) + 1), (CHARINDEX('.', ex.Number, (CHARINDEX('.', ex.Number, 1) + 1)) - (CHARINDEX('.', ex.Number, 1) + 1))) AS INT)) + 1) AS VARCHAR(16))
FROM tblDummyExample ex
WHERE ex.Number LIKE @startWith+'%'
I'm sure this solution is not restricted to any specific SQL Server version.
Try this: extract the first two parts, convert the second to a numeric value, add one, and convert it back to a string:
select
parsename(max(nummer), 3) + '.' -- 03
+ ltrim(max(cast(parsename(nummer, 2) as int) +1)) -- 5436 -> 5437
+ '.1'
from tbArtikel
where Nummer LIKE '03.%'
Try it like this:
DECLARE @table TABLE (col VARCHAR(10))
INSERT INTO @table
VALUES ('01.1453.1')
,('01.1453.2')
,('03.3456.1')
,('03.3456.2')
,('03.3456.3')
,('03.5436.1')
,('03.5436.2')
,('03.5436.3')
,('03.5436.4')
,('12.7839.1')
,('12.7839.2')
,('12.3232.1')
,('12.4444.1')
,('12.4444.2')
,('13.7676.1')
SELECT TOP 1 left(col, charindex('.', col, 1) - 1) + '.' + convert(VARCHAR(10), convert(INT, substring(col, charindex('.', col, 1) + 1, charindex('.', col, charindex('.', col, 1) + 1) - (charindex('.', col, 1) + 1))) + 1)
FROM @table
WHERE col LIKE '03.%'
ORDER BY 1 DESC
WHERE col LIKE '03.%'
ORDER BY 1 DESC
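The logic shared by the answers above (filter by prefix, take the maximum of the middle part, add one) can be sketched in Python as follows. The function name `next_number` is hypothetical. Zero-padding the result to four digits is an assumption on my part, based on the format of the sample data; the SQL answers cast to INT and back, which would drop leading zeros for a value such as 0001.

```python
def next_number(numbers, prefix):
    # Middle parts (e.g. 7839) of every number starting with "<prefix>."
    middles = [int(n.split(".")[1]) for n in numbers if n.startswith(prefix + ".")]
    if not middles:
        return None  # no product with this prefix
    # Assumption: the middle part is always shown with four digits.
    return f"{prefix}.{max(middles) + 1:04d}"

numbers = ["07.0001.1", "07.0001.2", "07.0001.3", "01.1453.1", "01.1453.2",
           "03.3456.1", "03.3456.2", "03.3456.3", "03.5436.1", "03.5436.2",
           "03.5436.3", "03.5436.4", "12.7839.1", "12.7839.2", "12.3232.1",
           "12.4444.1", "12.4444.2", "13.7676.1"]

print(next_number(numbers, "12"))
print(next_number(numbers, "03"))
```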
I have a column called IP with data such as 10.001.99.108
I want to run a script to change it to look like 10.1.99.108
I have used this before:
update TABLE set IP = substring(IP, patindex('%[^0]%',IP), 10)
but that only removes leading zeros at the beginning of the string.
I'm not sure how to change it to handle the second segment.
You can do this with parsename() and a method to remove the leading zeros. The following removes the leading zeros by casting each part to an integer and then back to a string, re-joining the parts with '.':
select (cast(cast(parsename(ip, 4) as int) as varchar(255)) + '.' +
cast(cast(parsename(ip, 3) as int) as varchar(255)) + '.' +
cast(cast(parsename(ip, 2) as int) as varchar(255)) + '.' +
cast(cast(parsename(ip, 1) as int) as varchar(255))
)
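The cast-to-integer-and-back idea is easy to check outside the database. A minimal Python sketch of the same transformation follows; the function name `strip_leading_zeros` is hypothetical, and it assumes every segment consists of digits only.

```python
def strip_leading_zeros(ip: str) -> str:
    # int("001") == 1, so converting each segment to int and back
    # drops the leading zeros; the segments are re-joined with dots.
    return ".".join(str(int(part)) for part in ip.split("."))

for ip in ["10.001.99.108", "010.001.099.008", "080.081.999.008"]:
    print(ip, "->", strip_leading_zeros(ip))
```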
Try this solution:
DECLARE @IpAdress AS TABLE ( IP VARCHAR(100) )
INSERT @IpAdress
( IP )
VALUES ( '10.001.99.108' ),
( '010.001.099.008' ),
( '080.081.999.008' );
WITH Tally
AS ( SELECT n = 1
UNION ALL
SELECT n + 1
FROM Tally
WHERE n <= 100
),
split
AS ( SELECT i.IP ,
CONVERT(INT, ( CASE WHEN CHARINDEX('.', S.string) > 0
THEN LEFT(S.string,
CHARINDEX('.', S.string)
- 1)
ELSE string
END )) AS part
FROM @IpAdress AS i
INNER JOIN Tally AS T ON SUBSTRING('.' + IP, T.N, 1) = '.'
AND T.N <= LEN(i.IP)
CROSS APPLY ( SELECT String = ( CASE
WHEN T.N = 1
THEN LEFT(i.IP,
CHARINDEX('.',
i.IP) - 1)
ELSE SUBSTRING(i.IP,
T.N, 1000)
END )
) S
)
SELECT DISTINCT
o.ip ,
SUBSTRING(( SELECT '.' + CONVERT(VARCHAR, i.part)
FROM split AS i
WHERE i.ip = o.ip
FOR
XML PATH('')
), 2, 1000) AS newIP
FROM split AS o
output result
ip newIP
010.001.099.008 10.1.99.8
080.081.999.008 80.81.999.8
10.001.99.108 10.1.99.108
I have a table like the following:
RequestNo Facility status
1 BDC1 Active
1 BDC2 Active
1 BDC3 Active
2 BDC1 Active
2 BDC2 Active
I want output like this:
RequestNo Facilty Count
1 BDC (1,2,3) 1
2 BDC(1,2) 1
The count should be based on Status per facility. The facility name should be taken as BDC only.
Try this (assuming that your facility is a fixed four-character code):
SELECT RequestNo, Fname + '(' + FnoList + ')' Facilty, count(*) cnt
FROM
(
SELECT distinct RequestNo,
SUBSTRING(Facility,1,3) Fname,
stuff((
select ',' + SUBSTRING(Facility,4,4)
from Dummy
where RequestNo = A.RequestNo AND
SUBSTRING(Facility,1,3) = SUBSTRING(A.Facility,1,3)
for xml path('')
) ,
1, 1, '') as FnoList
FROM Dummy A
) x
group by RequestNo, Fname, FnoList;
This doesn't put any constraints on the length of the Facility field. It takes the letters from the beginning and the digits from the end:
SELECT RequestNo, FacNameNumbers, COUNT(Status) as StatusCount
FROM
(
SELECT DISTINCT
t1.RequestNo,
t1.Status,
substring(facility, 1, patindex('%[^a-zA-Z ]%',facility) - 1) +
'(' +
STUFF((
SELECT DISTINCT ', ' + t2.fac_number
FROM (
select distinct
requestno,
substring(facility, 2 + len(facility) - patindex('%[^0-9 ]%',reverse(facility)), 9999) as fac_number
from facility
) t2
WHERE t2.RequestNo = t1.RequestNo
FOR XML PATH (''))
,1,2,'') + ')' AS FacNameNumbers
FROM Facility t1
) final
GROUP BY RequestNo, FacNameNumbers
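For comparison, the split-then-group logic can be sketched in Python using the sample rows from the question. It assumes every Facility value is letters followed by digits (such as BDC1) and leaves the count column out; the variable names are hypothetical.

```python
import re
from collections import defaultdict

rows = [(1, "BDC1", "Active"), (1, "BDC2", "Active"), (1, "BDC3", "Active"),
        (2, "BDC1", "Active"), (2, "BDC2", "Active")]

groups = defaultdict(list)
for request_no, facility, _status in rows:
    # Assumption: the facility is letters followed by digits, e.g. "BDC1".
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", facility)
    name, number = match.group(1), match.group(2)
    groups[(request_no, name)].append(number)

# One entry per (RequestNo, facility name), numbers joined in a list.
summary = {key: f"{key[1]}({','.join(nums)})" for key, nums in groups.items()}

for (request_no, _name), text in summary.items():
    print(request_no, text)
```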