Split string into multiple columns in Informatica PowerCenter - sql

My goal is to take query paths from an Excel sheet, split them, and insert the data into a Teradata table. How can I do it?
Here is an example of the scenario:
QUERYPATH:
/content/folder[#name='AAAAA AAAA']/folder[#name='A.B.C.']/folder[#name='AreaA']/folder[#name='Sub Area ABC']/folder[#name='GroupBB']/analysis[#name='Final elementJK']
/content/folder[#name='AAAAA AAAA']/folder[#name='A.B.C.']/folder[#name='AreaB']/folder[#name='Sub Area A.B.C.']/report[#name='Final elementHJ']
/content/folder[#name='AAAAA AAAA']/folder[#name='A.B.C.']/folder[#name='AreaC']/folder[#name='Sub BCD']/analysis[#name='Final elementFG']
id A| AAAAAAAAAA |idArea|Area |idSubArea| SubArea |idGroup | Group | Final Element |
112| AAAAAAAAAA | 22 |AreaC | 221 | Sub BCD | 2216 | GroupA | Final elementFG |
112| BDHDSKDDDD | 39 |AreaA | 393 | Sub ABC | 3931 | GroupBB | Final elementJK |
112| AAAAAAAAAA | 22 |AreaC | 222 | Sub BCD | 2217 |Final ElementLL| Final elementLL |
112| EEEEEEEEEE | 11 |AreaB | 114 |Sub A.B.C.| 1142 |Final elementHJ| Final elementHJ |
There is always an Area and a SubArea value. Group and FinalElement are usually "new values", and usually there is no group value, so in that case I copy the FinalElement value (example: Final elementLL, Final elementHJ).

There is no split function in PowerCenter. You'll need to use a combination of INSTR and SUBSTR functions to extract the appropriate values.
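The INSTR/SUBSTR idea can be sketched outside PowerCenter to check the logic before building the expression. The Python below is only an illustration, not PowerCenter syntax: `str.find` plays the role of INSTR and slicing plays the role of SUBSTR. The function names and the position mapping (third name is the Area, fourth is the SubArea, last is the FinalElement) are assumptions read off the sample paths in the question.

```python
START_MARKER = "[#name='"
END_MARKER = "']"


def extract_names(path):
    """Return every value found between [#name=' and '] in the path."""
    names = []
    pos = 0
    while True:
        start = path.find(START_MARKER, pos)   # INSTR-like: locate the next marker
        if start == -1:
            break
        start += len(START_MARKER)
        end = path.find(END_MARKER, start)     # INSTR-like: locate the closing marker
        if end == -1:
            break
        names.append(path[start:end])          # SUBSTR-like: cut out the value
        pos = end + len(END_MARKER)
    return names


def to_columns(path):
    """Map the extracted names to columns (positions are an assumption)."""
    names = extract_names(path)
    final_element = names[-1]
    # When the path has no group level, reuse the final element as the group.
    group = names[4] if len(names) > 5 else final_element
    return {
        "Area": names[2],
        "SubArea": names[3],
        "Group": group,
        "FinalElement": final_element,
    }
```

In PowerCenter itself the same steps would be expressed with nested INSTR and SUBSTR calls in an Expression transformation, one output port per column.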
Please also share (apart from the problem definition) your attempts on solving it and results you have. It will make it easier to help you achieve the results.

Related

Cumulative SUM in a query (SQL access)

Using MS Access SQL, I have a query (actually a UNION of multiple queries) and need a cumulative sum (actually a statement of account whose items are in chronological order).
How do I get a cumulative sum?
Since the rows contain duplicate dates, I have to add a new ID; however, SQL in MS Access does not seem to have ROW_ID or anything similar.
So, we need to sort donation data into chronological order across multiple tables that contain duplicate dates. First, combine all the donator tables in one query, which keeps the syntax simple. Then, to put things in order, we need an ordering for the duplicate dates. The dataset offers two natural tie-breakers for duplicate dates: the donator and the amount. For instance, we could decide that, after the date, bigger donations come first. If the rule is complicated enough, we abstract it into a public function in a code module and include it in the query so that we can sort by it:
'Sorted Donations:'
SELECT (BestDonator(q.donator)) as BestDonator, *
FROM tblCountries as q
UNION SELECT (BestDonator(j.donator)) as BestDonator, *
FROM tblIndividuals as j
ORDER BY EvDate Asc, Amount DESC , BestDonator DESC;
Public Function BestDonator(donator As String) As Long
BestDonator = Len(donator) 'longer names are better :)'
End Function
With the sorted donations, we have settled on an order for the duplicate dates and have combined both individual and country donations, so now we can calculate the running sum directly using either DSum or a subquery. There is no need to calculate a row ID. The tricky part is getting the syntax correct. I ended up abstracting the running sum calculation into a function and omitting BestDonator, because I couldn't easily paste this query together in the query designer and I ran out of time to fix bugs:
Public Function RunningSum(EvDate As Date, Amount As Currency)
RunningSum = DSum("Amount", "Sorted Donations", "(EvDate < #" & [EvDate] & "#) OR (EvDate = #" & [EvDate] & "# AND Amount >= " & [Amount] & ")")
End Function
Carefully note the OR in the DSum part of the RunningSum calculation. This is the tricky part of summing the right amounts.
Output:
-------------------------------------------------------------------------------------
| donator | EvDate | Amount | RunningSum |
-------------------------------------------------------------------------------------
| Reiny | 1/10/2020 | 321 | 321 |
-------------------------------------------------------------------------------------
| Czechia | 3/1/2020 | 7455 | 7776 |
-------------------------------------------------------------------------------------
| Germany | 3/18/2020 | 4222 | 11998 |
-------------------------------------------------------------------------------------
| Jim | 3/18/2020 | 222 | 12220 |
-------------------------------------------------------------------------------------
| Australien | 4/15/2020 | 13423 | 25643 |
-------------------------------------------------------------------------------------
| Mike | 5/31/2020 | 345 | 25988 |
-------------------------------------------------------------------------------------
| Portugal | 6/6/2020 | 8755 | 34743 |
-------------------------------------------------------------------------------------
| Slovakia | 8/31/2020 | 3455 | 38198 |
-------------------------------------------------------------------------------------
| Steve | 9/6/2020 | 875 | 39073 |
-------------------------------------------------------------------------------------
| Japan | 10/10/2020 | 5234 | 44307 |
-------------------------------------------------------------------------------------
| John | 10/11/2020 | 465 | 44772 |
-------------------------------------------------------------------------------------
| Slowenia | 11/11/2020 | 4665 | 49437 |
-------------------------------------------------------------------------------------
| Spain | 11/22/2020 | 7677 | 57114 |
-------------------------------------------------------------------------------------
| Austria | 11/22/2020 | 3221 | 60335 |
-------------------------------------------------------------------------------------
| Bill | 11/22/2020 | 767 | 61102 |
-------------------------------------------------------------------------------------
| Bert | 12/1/2020 | 755 | 61857 |
-------------------------------------------------------------------------------------
| Hungaria | 12/24/2020 | 9996 | 71853 |
-------------------------------------------------------------------------------------
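The same OR condition also works as a correlated subquery instead of DSum. The sketch below is a runnable illustration only: it uses SQLite through Python's `sqlite3` module as a stand-in for Access, the table and column names (`donations`, `ev_date`) are made up for the example, and the dates are stored as ISO strings so that plain comparison sorts them chronologically. The four rows are taken from the top of the output above.

```python
import sqlite3

RUNNING_SUM_SQL = """
SELECT d.donator, d.ev_date, d.amount,
       (SELECT SUM(p.amount)
          FROM donations AS p
         WHERE p.ev_date < d.ev_date
            OR (p.ev_date = d.ev_date AND p.amount >= d.amount)) AS running_sum
  FROM donations AS d
 ORDER BY d.ev_date ASC, d.amount DESC
"""


def running_sums(rows):
    """Load (donator, ev_date, amount) rows and return them with a running sum."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE donations (donator TEXT, ev_date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO donations VALUES (?, ?, ?)", rows)
    result = conn.execute(RUNNING_SUM_SQL).fetchall()
    conn.close()
    return result


SAMPLE = [
    ("Jim", "2020-03-18", 222),
    ("Reiny", "2020-01-10", 321),
    ("Germany", "2020-03-18", 4222),
    ("Czechia", "2020-03-01", 7455),
]
```

Note that two rows sharing both the date and the amount would receive the same running sum under this condition, so a further tie-breaker (such as BestDonator) would be needed in that case.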

Filtering Query Results to keep certain Template but when that template doesn't exist still have an entry

I have 3 tables, A, B and C, which I wish to join together, removing duplicate values from a field in A but giving preference to a certain value in C.
My tables are as follows:
A
+--------------+--------------+-----------------+
| Installation | Substructure | Description |
+--------------+--------------+-----------------+
| A | 12 | non-unique text |
+--------------+--------------+-----------------+
| A | 22 | Non-unique text |
+--------------+--------------+-----------------+
| B | 54 | Non-unique text |
+--------------+--------------+-----------------+
This is left joined with table B on Substructure:
+--------------+-----------+
| Substructure | Reference |
+--------------+-----------+
| 12 | REF001 |
+--------------+-----------+
| 12 | REF002 |
+--------------+-----------+
| 12 | REF003 |
+--------------+-----------+
| 22 | REF004 |
+--------------+-----------+
| 22 | REF005 |
+--------------+-----------+
| 54 | REF006 |
+--------------+-----------+
| 54 | REF007 |
+--------------+-----------+
| 54 | REF008 |
+--------------+-----------+
This is further right joined with table C on Reference:
+-----------+-----------------+---------------+
| Reference | Description | Template_Type |
+-----------+-----------------+---------------+
| REF001 | Some Text | PNID |
+-----------+-----------------+---------------+
| REF002 | More Text | ISO |
+-----------+-----------------+---------------+
| REF003 | Non-Unique Text | Phot |
+-----------+-----------------+---------------+
The current form of the code is something like
SELECT DISTINCT
A.Substructure,
A.Description,
B.Reference,
C.Description AS REF_DES
FROM A
LEFT JOIN B ON (A.SUBSTRUCTURE = B.SUBSTRUCTURE)
RIGHT JOIN C ON (B.REFERENCE = C.REFERENCE)
This works and returns every Template_Type and Reference associated with a given Substructure. However, what I'd like to do now is remove the duplicate substructure entries from the result, keeping those whose Template_Type is PNID. If a substructure has no PNID entry, I'd still like one entry returned for it. If there's no document entry at all, I'd also like an entry for that substructure returned.
I tried using various WHERE conditions to filter the results further but obviously filtering on TEMPLATE_TYPE = value will exclude all the substructures that do not have PNIDS.
Unfortunately I have no control over how the data is stored in the tables.
The solution to this was to run a sub-query that filters table C for PNIDs and then join against its results.
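One possible reading of that solution is sketched below. It is not the poster's actual query: it runs on SQLite via Python's `sqlite3` as a stand-in for the real database, and it assumes that a substructure without a PNID should come back with an empty reference. The sub-query joins B to C and keeps only PNID rows, and A is then left joined to it so that every substructure survives.

```python
import sqlite3

PNID_SQL = """
SELECT a.substructure, a.description, p.reference, p.description AS ref_des
  FROM a
  LEFT JOIN (SELECT b.substructure, b.reference, c.description
               FROM b
               JOIN c ON b.reference = c.reference
              WHERE c.template_type = 'PNID') AS p
         ON p.substructure = a.substructure
 ORDER BY a.substructure
"""


def pnid_per_substructure():
    """Build the sample tables from the question and run the PNID query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (installation TEXT, substructure INTEGER, description TEXT);
        CREATE TABLE b (substructure INTEGER, reference TEXT);
        CREATE TABLE c (reference TEXT, description TEXT, template_type TEXT);
        INSERT INTO a VALUES ('A', 12, 'non-unique text'),
                             ('A', 22, 'Non-unique text'),
                             ('B', 54, 'Non-unique text');
        INSERT INTO b VALUES (12, 'REF001'), (12, 'REF002'), (12, 'REF003'),
                             (22, 'REF004'), (22, 'REF005'),
                             (54, 'REF006'), (54, 'REF007'), (54, 'REF008');
        INSERT INTO c VALUES ('REF001', 'Some Text', 'PNID'),
                             ('REF002', 'More Text', 'ISO'),
                             ('REF003', 'Non-Unique Text', 'Phot');
    """)
    rows = conn.execute(PNID_SQL).fetchall()
    conn.close()
    return rows
```

With the sample data, substructure 12 comes back with REF001 and substructures 22 and 54 come back with empty reference columns. If a substructure could have more than one PNID, it would still produce several rows, so a further rule would be needed there.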

postgres: Multiply column of table A with rows of table B

Fellow SOers,
Currently I am stuck on the following problem.
Say we have table "data" and table "factor"
"data":
---------------------
| col1 | col2 |
----------------------
| foo | 2 |
| bar | 3 |
----------------------
and table "factor" (the amount of rows is variable)
---------------------
| name | val |
---------------------
| f1 | 7 |
| f2 | 8 |
| f3 | 9 |
| ... | ... |
---------------------
and the result should look like this:
---------------------------------
| col1 | f1 | f2 | f3 | ...|
---------------------------------
| foo | 14 | 16 | 18 | ...|
| bar | 21 | 24 | 27 | ...|
---------------------------------
So basically I want column "col2" multiplied by every "val" in table "factor", AND the contents of column "name" should act as the column headers of the result.
We are using Postgres 9.3 (an upgrade to a higher version may be possible). An extended search turned up multiple possible solutions: using crosstab (though even with crosstab I was not able to figure this one out), or using a CTE with "WITH" (preferred, but also no luck). This can probably also be done with the correct use of array() and unnest().
Hence, any help on how to achieve this is appreciated (the less code, the better).
Thanks in advance!
This package seems to do what you want:
https://github.com/hnsl/colpivot
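If installing a package is not an option, one alternative is to build the column list on the client and send a generated query. The sketch below is not the colpivot API. It is a hypothetical helper (`multiply_pivot`) that reads the factor names first, then issues a cross join with one conditional aggregate per factor. It runs against SQLite through Python's `sqlite3` as a stand-in for Postgres, and the same statement shape should carry over, though that is untested here.

```python
import sqlite3


def quote_ident(name):
    """Quote a value for use as a column alias (standard SQL double quotes)."""
    return '"' + name.replace('"', '""') + '"'


def multiply_pivot(conn):
    """Return (header, rows) with one output column per row of table factor."""
    names = [r[0] for r in conn.execute("SELECT name FROM factor ORDER BY name")]
    select_list = ["d.col1 AS col1"] + [
        f"MAX(CASE WHEN f.name = ? THEN d.col2 * f.val END) AS {quote_ident(n)}"
        for n in names
    ]
    sql = (
        "SELECT " + ", ".join(select_list) +
        " FROM data AS d CROSS JOIN factor AS f"
        " GROUP BY d.col1 ORDER BY d.col1"
    )
    cur = conn.execute(sql, names)           # factor names are bound as parameters
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()


def demo():
    """Load the sample tables from the question and pivot them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE data (col1 TEXT, col2 INTEGER);
        CREATE TABLE factor (name TEXT, val INTEGER);
        INSERT INTO data VALUES ('foo', 2), ('bar', 3);
        INSERT INTO factor VALUES ('f1', 7), ('f2', 8), ('f3', 9);
    """)
    result = multiply_pivot(conn)
    conn.close()
    return result
```

The trade-off is two round trips (one to read the factor names, one for the generated query) in exchange for not needing any server-side extension.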

Unique string table in SQL and replacing index values with string values during query

I'm working on an old SQL Server database that has several tables that look like the following:
|-------------|-----------|-------|------------|------------|-----|
| MachineName | AlarmName | Event | AlarmValue | SampleTime | ... |
|-------------|-----------|-------|------------|------------|-----|
| 3 | 180 | 8 | 6.780 | 2014-02-24 | |
| 9 | 67 | 8 | 1.45 | 2014-02-25 | |
| ... | | | | | |
|-------------|-----------|-------|------------|------------|-----|
There is a separate table in the database that only contains unique strings, as well as the index for each unique string. The unique string table looks like this:
|----------|--------------------------------|
| Id | String |
|----------|--------------------------------|
| 3 | MyMachine |
| ... | |
| 8 | High CPU Usage |
| ... | |
| 67 | 404 Error |
| ... | |
|----------|--------------------------------|
Thus, when we want to get something out of the database, we get the respective rows out, then lookup each missing string based on the index value.
What I'm hoping to do is to replace all of the string indexes with the actual values in a single query without having to do post-processing on the query result.
However, I can't figure out how to do this in a single query. Do I need to use multiple JOINs? I've only been able to figure out how to replace a single value by doing something like -
SELECT UniqueString.String AS "MachineName" FROM UniqueString
JOIN Alarm ON Alarm.MachineName = UniqueString.Id
Any help would be much appreciated!
Yes, you can do multiple joins to the UniqueString table, but change the order to start with the table you are reporting on, and use a unique alias for each join of that table. Something like:
SELECT MN.String AS 'MachineName', AN.String as 'AlarmName' FROM Alarm A
JOIN UniqueString MN ON A.MachineName = MN.Id
JOIN UniqueString AN ON A.AlarmName = AN.Id
and so on for any other columns.
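To see the aliased joins run end to end, here is a small sketch using SQLite via Python's `sqlite3` as a stand-in for SQL Server. The alarm row is hypothetical: it was chosen so that all three of its ids (3, 67, 8) appear in the UniqueString excerpt from the question, since the strings for ids such as 180 and 9 are not shown there.

```python
import sqlite3

LOOKUP_SQL = """
SELECT MN.String AS MachineName,
       AN.String AS AlarmName,
       EV.String AS Event,
       A.AlarmValue
  FROM Alarm AS A
  JOIN UniqueString AS MN ON A.MachineName = MN.Id
  JOIN UniqueString AS AN ON A.AlarmName = AN.Id
  JOIN UniqueString AS EV ON A.Event = EV.Id
"""


def resolve_alarm_strings():
    """Replace the three id columns of Alarm with their strings in one query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE UniqueString (Id INTEGER PRIMARY KEY, String TEXT);
        CREATE TABLE Alarm (MachineName INTEGER, AlarmName INTEGER,
                            Event INTEGER, AlarmValue REAL, SampleTime TEXT);
        INSERT INTO UniqueString VALUES (3, 'MyMachine'),
                                        (8, 'High CPU Usage'),
                                        (67, '404 Error');
        -- Hypothetical row: every id here exists in UniqueString above.
        INSERT INTO Alarm VALUES (3, 67, 8, 1.45, '2014-02-25');
    """)
    rows = conn.execute(LOOKUP_SQL).fetchall()
    conn.close()
    return rows
```

With inner joins, an alarm row whose id has no match in UniqueString drops out of the result. Using LEFT JOIN instead would keep such rows and leave the unresolved column empty.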

How to Transpose a resultset from SQL

I am using Microsoft SQL Server 2008.
I have a table that looks something like this:
|======================================================|
| RespondentId | QuestionId | AnswerValue | ColumnName |
|======================================================|
| P123 | 1 | Y | CanBathe |
|------------------------------------------------------|
| P123 | 2 | 3 | TimesADay |
|------------------------------------------------------|
| P123 | 3 | 1.00 | SoapPrice |
|------------------------------------------------------|
| P465 | 1 | Y | CanBathe |
|------------------------------------------------------|
| P465 | 2 | 1 | TimesADay |
|------------------------------------------------------|
| P465 | 3 | 0.99 | SoapPrice |
|------------------------------------------------------|
| P901 | 1 | N | CanBathe |
|------------------------------------------------------|
| P901 | 2 | 0 | TimesADay |
|------------------------------------------------------|
| P901 | 3 | 0.00 | SoapPrice |
|------------------------------------------------------|
I would like to flip the rows to be columns so that this table looks like this:
|=================================================|
| RespondentId | CanBathe | TimesADay | SoapPrice |
|=================================================|
| P123 | Y | 3 | 1.00 |
|-------------------------------------------------|
| P465 | Y | 1 | 0.99 |
|-------------------------------------------------|
| P901 | N | 0 | 0.00 |
|-------------------------------------------------|
(the example data here is arbitrarily made up, so it's silly)
The source table is a temp table with approximately 70,000 rows.
What SQL would I need to write to do this?
Update
I don't even know if PIVOT is the right way to go.
I don't know what column to PIVOT on.
The documentation mentions <aggregation function> and <column being aggregated> and I don't want to aggregate anything.
Thanks in advance.
It is required to use an aggregate function if you use PIVOT. However, since your (RespondentId, QuestionId) combination is unique, your "groups" will have only one row, so you can use MIN() as an aggregate function:
SELECT RespondentId, CanBathe, TimesADay, SoapPrice
FROM (SELECT RespondentId, ColumnName, AnswerValue FROM MyTable) AS src
PIVOT (MIN(AnswerValue) FOR ColumnName IN(CanBathe, TimesADay, SoapPrice)) AS pvt
If a group only contains one row, then MIN(value) = value, or in other words: the aggregate function becomes the identity function.
See if this gets you started. You used to have to use CASE statements to make that happen, but it looks like some form of PIVOT is in SQL Server now.
PIVOT is a start, but the thing with SQL queries is that you really need to know which columns to expect in the result set before writing the query. If you don't know this, then, the last time I checked, you have to either resort to dynamic SQL or let the client app that retrieves the data do the pivot instead.
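For reference, the older CASE-based approach can be sketched as follows. This is an illustration on SQLite via Python's `sqlite3`, which has no PIVOT operator, rather than SQL Server itself. It relies on the same one-row-per-group observation, so MIN() simply picks the single non-NULL value in each group. The table name `MyTable` follows the PIVOT answer above.

```python
import sqlite3

CASE_PIVOT_SQL = """
SELECT RespondentId,
       MIN(CASE WHEN ColumnName = 'CanBathe'  THEN AnswerValue END) AS CanBathe,
       MIN(CASE WHEN ColumnName = 'TimesADay' THEN AnswerValue END) AS TimesADay,
       MIN(CASE WHEN ColumnName = 'SoapPrice' THEN AnswerValue END) AS SoapPrice
  FROM MyTable
 GROUP BY RespondentId
 ORDER BY RespondentId
"""

SAMPLE_ROWS = [
    ("P123", 1, "Y", "CanBathe"), ("P123", 2, "3", "TimesADay"), ("P123", 3, "1.00", "SoapPrice"),
    ("P465", 1, "Y", "CanBathe"), ("P465", 2, "1", "TimesADay"), ("P465", 3, "0.99", "SoapPrice"),
    ("P901", 1, "N", "CanBathe"), ("P901", 2, "0", "TimesADay"), ("P901", 3, "0.00", "SoapPrice"),
]


def case_pivot(rows):
    """Turn one row per answer into one row per respondent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (RespondentId TEXT, QuestionId INTEGER, "
        "AnswerValue TEXT, ColumnName TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(CASE_PIVOT_SQL).fetchall()
    conn.close()
    return result
```

The column names still have to be known when the query is written, which is the same limitation the PIVOT form has.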