Pentaho database lookup to fetch data - sql

I have a database table, "FTTAGS", which has the 3 fields below:
FTDATA
INST
KEY
FTDATA holds a value like the one shown below.
19,40,92,27456,1,9,6,7,121,6,7,5,01,24001,523247,19,005,12,6,7,222,2,98,1241222514,0,3933602,2745,8,1,1,1,1,1,1,1,16,6,6,6,335,19,40,92,2745,1,9,6,7,2745,8,1,1,1,1,1,1,1,16,6,6,6,335,98,5,01,204198,192,9,47,47,20,5,12,6,7,12,6,7
INST holds a value like
"Frequecy -21 0x337811 gf.2241"
I need to search for "3933602" and fetch the corresponding KEY.
I cannot find any option in the Database lookup step to do this. Please help.
I tried running the SQL query below with the "Execute SQL script" step, but it did not return any output values.
select KEY from FTTAGS where INSTR(FTDATA,'?') > 0 AND INST like '?';

You can achieve this with the Database join step. Let us see how.
SQL: select * from FTTAGS where FTDATA = 3933602;
Add a Database join step to the canvas.
SQL: select * from FTTAGS where FTDATA = ?;
See the image below for clarification.
If you want both results (the matching and the non-matching rows), you can follow the image below.
Database join step:
SQL: select * from FTTAGS
Filter step:
<field> => FTDATA
operator => =
<value> => 3933602
=> The dummy step with the red mark receives the rows that do not satisfy the condition.
=> The dummy step with the green tick mark receives the rows that satisfy the condition.
I think this information will be useful to you.
Database join_1 step:
SQL: select * from FTTAGS where FTDATA = 3933602
Database join_2 step:
SQL: select * from FTTAGS where FTDATA <> 3933602
For clarification, see the image below.
Thank you.
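Since FTDATA in the question holds a comma-separated list rather than a single number, an equality test on the whole column may not match. Also note that a quoted '?' is normally treated as a literal string by the driver, not as a parameter placeholder. Below is a minimal sketch of a whole-element match with unquoted placeholders, using Python's sqlite3 as a stand-in for the real database. The sample rows, the K1/K2 keys and the find_keys helper are made up for illustration; on another database the concatenation operator and LIKE rules may differ.

```python
import sqlite3

def find_keys(conn, token, inst_pattern):
    """Return KEY values whose FTDATA list contains `token` as a whole element."""
    # Wrapping both sides in commas stops 3933602 from matching inside 13933602.
    # Assumes the token has no LIKE wildcards (% or _), which holds for digits.
    sql = (
        'SELECT "KEY" FROM FTTAGS '
        "WHERE ',' || FTDATA || ',' LIKE '%,' || ? || ',%' "
        "AND INST LIKE ?"
    )
    return [row[0] for row in conn.execute(sql, (token, inst_pattern))]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE FTTAGS (FTDATA TEXT, INST TEXT, "KEY" TEXT)')
conn.executemany(
    "INSERT INTO FTTAGS VALUES (?, ?, ?)",
    [
        # Hypothetical rows: K1 contains the element, K2 only a longer number.
        ("19,40,92,3933602,2745,8", "Frequecy -21 0x337811 gf.2241", "K1"),
        ("19,40,92,13933602,2745,8", "Frequecy -21 0x337811 gf.2241", "K2"),
    ],
)
print(find_keys(conn, "3933602", "Frequecy%"))
```

In Pentaho the same WHERE clause could go into the Database join step, with the two placeholders bound to incoming fields, assuming the target database supports this concatenation syntax.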


How to Take two SQL queries of a column of substrings to search database using the appended result of the other

I have two select statements that each contain a column of substrings derived from a database table. The substrings come from a varchar that should be XML, but it was saved as varchar because the content might not be well-formed and could be invalid.
I am trying to take the table that results from the 1st query, a list of 50 varchars, and search the database using the 2nd query. I could get from 0 to n SQLRelatesMessageID sets for each SQLmessageID if I take each row from the first query and append it to a string to build the node ("z4480" is an example here).
I tried a cursor implementation, but the performance deterred me from finishing it. A join doesn't work if you try to reference the substring column by its "as" alias. What steps should I take to get the overall list of SQLRelatesMessageIDs? My goal is to get all MessageLogId values (3 in the picture) for a given NCPDPID.
I am using SQL Server Manager 2012.
--1--Receives a list based on a given NCPDPID node value
select substring(m.message, charindex('<MessageID>', m.message)+11, charindex('</MessageID>', m.message)-charindex('<MessageID>', m.message)-11) as SQLmessageID
from messagelog m where message like '%<NCPDPID>'+'1234567'+'</NCPDPID>%'
--2--Selects messageID from top select and searches RelatesToMessageID node
select substring(r.message, charindex('<RelatesToMessageID>', r.message)+20, charindex('</RelatesToMessageID>', r.message)-charindex('<RelatesToMessageID>', r.message)-20) as SQLRelatesMessageID, * from messagelog r
where message like ('%<RelatesToMessageID>'+'z4480'+'</RelatesToMessageID>%')
This is the query that worked for me.
---main
SELECT * FROM
(
select substring(m.message, charindex('<MessageID>', m.message)+11, charindex('</MessageID>', m.message)-charindex('<MessageID>', m.message)-11) as SQLmessageID from messagelog m
where message like '%<NCPDPID>1234567</NCPDPID>%' and dateTime > '3/01/2016'
) a JOIN
(
select
substring(r.message, charindex('<RelatesToMessageID>', r.message)+20, charindex('</RelatesToMessageID>', r.message)-charindex('<RelatesToMessageID>', r.message)-20) as SQLRelatesMessageID,
message,
messagelogid from messagelog r
where
dateTime > '3/01/2016' AND
message LIKE ('%<RelatesToMessageID>%</RelatesToMessageID>%')
) b ON b.SQLRelatesMessageID = a.SQLmessageID
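The derived-table join above can be tried out on a small scale. The following is a rough sketch using Python's sqlite3, where instr and substr stand in for T-SQL's charindex and substring. The three sample messages, the between helper and the related_log_ids function are invented for illustration, and the date filter is left out to keep it short.

```python
import sqlite3

def between(col, tag):
    """SQLite expression for the text between <tag> and </tag> in column `col`."""
    open_tag, close_tag = f"<{tag}>", f"</{tag}>"
    start = f"instr({col}, '{open_tag}')"
    return (
        f"substr({col}, {start} + {len(open_tag)}, "
        f"instr({col}, '{close_tag}') - {start} - {len(open_tag)})"
    )

# Each side extracts its substring inside a derived table, so the outer
# join can compare the two aliases directly.
QUERY = f"""
SELECT b.messagelogid
FROM (
    SELECT {between('m.message', 'MessageID')} AS SQLmessageID
    FROM messagelog m
    WHERE m.message LIKE '%<NCPDPID>' || ? || '</NCPDPID>%'
) a
JOIN (
    SELECT {between('r.message', 'RelatesToMessageID')} AS SQLRelatesMessageID,
           r.messagelogid
    FROM messagelog r
    WHERE r.message LIKE '%<RelatesToMessageID>%</RelatesToMessageID>%'
) b ON b.SQLRelatesMessageID = a.SQLmessageID
ORDER BY b.messagelogid
"""

def related_log_ids(conn, ncpdpid):
    """messagelogid of every message that relates to a message from `ncpdpid`."""
    return [row[0] for row in conn.execute(QUERY, (ncpdpid,))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messagelog (messagelogid INTEGER, message TEXT)")
conn.executemany(
    "INSERT INTO messagelog VALUES (?, ?)",
    [  # hypothetical sample messages
        (1, "<Msg><NCPDPID>1234567</NCPDPID><MessageID>z4480</MessageID></Msg>"),
        (2, "<Msg><MessageID>r1</MessageID><RelatesToMessageID>z4480</RelatesToMessageID></Msg>"),
        (3, "<Msg><MessageID>r2</MessageID><RelatesToMessageID>other</RelatesToMessageID></Msg>"),
    ],
)
print(related_log_ids(conn, "1234567"))
```

The point of the derived tables is that the alias only becomes a joinable column once the SELECT that defines it is wrapped in parentheses and given a name (a and b here).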

SQL returns the unique identifier instead of the value in my Access UNION ALL SQL

Here is my project, using MS Access 2010.
I have developed 2 queries to select 2 different reading periods. These queries are called CycleStart and CycleEnd. When I run these 2 queries individually, I get the expected output. These 2 queries pull data from tables that have a couple of lookup fields in them; the lookup fields use other tables that have only 2 columns. As the next step, I use SQL to create a UNION ALL query that brings these 2 cycle queries together for reporting purposes. The problem I run into is that the resulting union query does not output the same information as the 2 individual cycle queries.
Now the specific issues. My cycle queries have a couple of lookup fields referencing another table. For example, the Read_Cycle field comes from a table (Read_Cycles) that has only 2 columns: the unique identifier assigned by Access and the Read_Cycle column with the data I enter. When I run the cycle queries, the Read_Cycle field returns the Read_Cycle data as expected, but the union query does not. Here is some of the structure of my project:
Read_Cycles table
ID (Col1) | Cycle_ID (Col2)
1 | Spring
2 | Fall
3 | Winter
The data tables behind CycleStart and CycleEnd have fields that are lookup values referencing the Read_Cycles table described above.
The CycleStart and CycleEnd queries correctly return Spring, Fall or Winter, whichever value is associated with the record.
However, the problem is that the union SQL query returns the ID instead of the value, so instead of getting Fall, I get 2.
Here is my UNION ALL SQL:
SELECT "CycleEnd" AS source,
[CycleEnd].[Recloser_SerialNo],
[CycleEnd].[Read_Date],
[CycleEnd].[3_Phase_Reading],
[CycleEnd].[A_Phase_Reading],
[CycleEnd].[B_Phase_Reading],
[CycleEnd].[C_Phase_Reading],
[CycleEnd].[Read_Cycle],
[CycleEnd].[PoleNo],
[CycleEnd].[Substation],
[CycleEnd].[Feeder],
[CycleEnd].[Feeder_Description],
[CycleEnd].[Recloser_Location]
FROM [CycleEnd]
UNION ALL
SELECT "CycleStart" AS source,
[CycleStart].[Recloser_SerialNo],
[CycleStart].[Read_Date],
[CycleStart].[3_Phase_Reading] * - 1,
[CycleStart].[A_Phase_Reading] * - 1,
[CycleStart].[B_Phase_Reading] * - 1,
[CycleStart].[C_Phase_Reading] * - 1,
[CycleStart].[Read_Cycle],
[CycleStart].[PoleNo],
[CycleStart].[Substation],
[CycleStart].[Feeder],
[CycleStart].[Feeder_Description],
[CycleStart].[Recloser_Location]
FROM [CycleStart];
All other fields come across as expected; I have narrowed the problem down to only the fields that are lookups in the original tables.
Any help would be greatly appreciated. Also, my SQL experience is really limited, so example code would help greatly.
UPDATE:
Here is the SQL from CycleEnd, which works. I got this by building the query and then switching to the SQL view...
SELECT Recloser_Readings.Recloser_SerialNo,
Recloser_Readings.Read_Date,
Recloser_Readings.[3_Phase_Reading],
Recloser_Readings.A_Phase_Reading,
Recloser_Readings.B_Phase_Reading,
Recloser_Readings.C_Phase_Reading,
Recloser_Locations.PoleNo,
Recloser_Locations.Substation,
Recloser_Locations.Feeder,
Recloser_Locations.Feeder_Description,
Recloser_Locations.Recloser_Location,
Recloser_Readings.Read_Cycle
FROM (
Recloser_Inventory LEFT JOIN Recloser_Locations
ON Recloser_Inventory.PoleNo = Recloser_Locations.PoleNo
)
RIGHT JOIN Recloser_Readings
ON Recloser_Inventory.Serial_No = Recloser_Readings.Recloser_SerialNo
WHERE (((Recloser_Readings.Read_Cycle) = "8"));
UPDATE#2
I noticed I grabbed the wrong code that references the Read_Cycles table. Here it is...
SELECT Read_Cycles.Cycle_ID, Read_Cycles.ID
FROM Read_Cycles
ORDER BY Read_Cycles.Cycle_ID DESC;
UPDATE : SYNTAX ERROR FROM THE FOLLOWING CODE!!
SELECT "CycleEnd" as source,
[CycleEnd].[Recloser_SerialNo],
[CycleEnd].[Read_Date],
[CycleEnd].[3_Phase_Reading],
[CycleEnd].[A_Phase_Reading],
[CycleEnd].[B_Phase_Reading],
[CycleEnd].[C_Phase_Reading],
[CycleEnd].[Read_Cycle],
[CycleEnd].[PoleNo],
[CycleEnd].[Substation],
[CycleEnd].[Feeder],
[CycleEnd].[Feeder_Description],
[CycleEnd].[Recloser_Location]
FROM [CycleEnd] JOIN [Read_Cycles] ON [CycleEnd].[Read_Cycle] = [Read_Cycles].[ID]
UNION ALL SELECT "CycleStart" as source,
[CycleStart].[Recloser_SerialNo],
[CycleStart].[Read_Date],
[CycleStart].[3_Phase_Reading]*-1,
[CycleStart].[A_Phase_Reading]*-1,
[CycleStart].[B_Phase_Reading]*-1,
[CycleStart].[C_Phase_Reading]*-1,
[CycleStart].[Read_Cycle],
[CycleStart].[PoleNo],
[CycleStart].[Substation],
[CycleStart].[Feeder],
[CycleStart].[Feeder_Description],
[CycleStart].[Recloser_Location]
FROM [CycleStart] JOIN [Read_Cycles] ON [CycleStart].[Read_Cycle] = [Read_Cycles].[ID];
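As far as I know, Access SQL does not accept a bare JOIN and expects INNER JOIN (or LEFT/RIGHT JOIN) to be spelled out, which may explain the syntax error. Also, a lookup field stores the ID and only displays the text, so to get the text out of a union each branch probably has to select the label column from Read_Cycles instead of the stored Read_Cycle value. Here is a reduced sketch of that shape, run against SQLite from Python rather than Access. The tables stand in for the saved queries, the column list is shortened, and the single Reading column and the sample rows are made up.

```python
import sqlite3

# Each branch joins to the lookup table and returns its label column
# (Cycle_ID) in place of the stored numeric Read_Cycle value.
UNION_SQL = """
SELECT 'CycleEnd' AS source, e.Recloser_SerialNo, e.Reading,
       rc.Cycle_ID AS Read_Cycle
FROM CycleEnd AS e INNER JOIN Read_Cycles AS rc ON e.Read_Cycle = rc.ID
UNION ALL
SELECT 'CycleStart', s.Recloser_SerialNo, s.Reading * -1, rc.Cycle_ID
FROM CycleStart AS s INNER JOIN Read_Cycles AS rc ON s.Read_Cycle = rc.ID
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Read_Cycles (ID INTEGER, Cycle_ID TEXT);
CREATE TABLE CycleEnd (Recloser_SerialNo TEXT, Reading INTEGER, Read_Cycle INTEGER);
CREATE TABLE CycleStart (Recloser_SerialNo TEXT, Reading INTEGER, Read_Cycle INTEGER);
INSERT INTO Read_Cycles VALUES (1, 'Spring'), (2, 'Fall'), (3, 'Winter');
INSERT INTO CycleEnd VALUES ('R100', 50, 2);    -- hypothetical reading
INSERT INTO CycleStart VALUES ('R100', 20, 2);  -- hypothetical reading
""")

rows = sorted(conn.execute(UNION_SQL).fetchall())
for row in rows:
    print(row)
```

In Access the identifiers would keep their square brackets, and the column selected in place of [CycleEnd].[Read_Cycle] would be [Read_Cycles].[Cycle_ID].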

MS SQL complex update query

I have multiple tables and am trying to update some of them based on matching translations in master tables. I can update the table based on the first available translation, but I would like to do it based on the most frequent one. I found quite a lot of samples showing how to achieve that in simple queries, but I can't get it to work in the complex query below.
In brief, I just need to take the most frequent translation [target] from [EC3_800_FR_M] and copy it to [EC3_800_FR], instead of taking the first occurrence in the table.
UPDATE [ec3_800_fr]
SET [ec3_800_fr].[target] = (SELECT TOP 1
[ec3_800_fr_m].[target]
FROM [ec3_800_fr_m]
WHERE (
[ec3_800_fr].[enus] = [ec3_800_fr_m].[enus] AND
[ec3_800_fr].[length] = [ec3_800_fr_m].[length] AND
[ec3_800_fr].[key6_domainname] = [ec3_800_fr_m].[key6_domainname] AND
[ec3_800_fr_m].[status] > 1
)
);
who can help me???
merci - thank you - Dank u - Danke
BR
lolo
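One possible approach is to group the matching master rows by [target] inside the subquery and order the groups by their count, so the top row is the most frequent translation rather than an arbitrary one. The sketch below tries this out with Python's sqlite3, so it uses LIMIT 1 where SQL Server would use TOP 1. The sample rows and the update_targets helper are invented, and ties are broken alphabetically, which is an assumption on my part.

```python
import sqlite3

# Correlated subquery: per target row, count each candidate translation
# among the matching master rows and keep the most frequent one.
UPDATE_SQL = """
UPDATE ec3_800_fr
SET target = (
    SELECT m.target
    FROM ec3_800_fr_m AS m
    WHERE m.enus = ec3_800_fr.enus
      AND m.length = ec3_800_fr.length
      AND m.key6_domainname = ec3_800_fr.key6_domainname
      AND m.status > 1
    GROUP BY m.target
    ORDER BY COUNT(*) DESC, m.target
    LIMIT 1
)
"""

def update_targets(conn):
    """Copy the most frequent master translation into ec3_800_fr.target."""
    conn.execute(UPDATE_SQL)
    return conn.execute("SELECT enus, target FROM ec3_800_fr").fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ec3_800_fr (enus TEXT, length INTEGER, key6_domainname TEXT, target TEXT);
CREATE TABLE ec3_800_fr_m (enus TEXT, length INTEGER, key6_domainname TEXT,
                           target TEXT, status INTEGER);
INSERT INTO ec3_800_fr VALUES ('Hello', 5, 'ui', NULL);
-- Hypothetical master rows: 'Salut' comes first, 'Bonjour' is most frequent,
-- and the 'Coucou' rows are ignored because their status is not above 1.
INSERT INTO ec3_800_fr_m VALUES
    ('Hello', 5, 'ui', 'Salut', 2),
    ('Hello', 5, 'ui', 'Bonjour', 2),
    ('Hello', 5, 'ui', 'Bonjour', 3),
    ('Hello', 5, 'ui', 'Coucou', 1),
    ('Hello', 5, 'ui', 'Coucou', 1),
    ('Hello', 5, 'ui', 'Coucou', 1);
""")
print(update_targets(conn))
```

As in the original statement, rows with no matching master row end up with a NULL target. In T-SQL the subquery would read SELECT TOP 1 ... GROUP BY ... ORDER BY COUNT(*) DESC.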

How would I implement this query in an SSIS dataflow?

I have something that I did in an Execute SQL Task, but my project manager would rather see it in a Data Flow Task.
INSERT INTO [dbo].[lookup_product]
([dim_global_data_source_id]
,[source_product]
,[source_product_type]
,[source_grade]
,[source_gauge]
,[source_width]
)
SELECT distinct
dim_global_data_source_id,
product_desc,
product_type,
grade,
gauge,
size1
FROM Staging_informix_Coil_is
where not exists
(select source_product
from lookup_product
where lookup_product.dim_global_data_source_id = Staging_informix_Coil_is.dim_global_data_source_id
and isnull(lookup_product.source_product,'') = isnull(Staging_informix_Coil_is.product_desc,'')
and lookup_product.source_product_type = Staging_informix_Coil_is.product_type
and isnull(lookup_product.source_grade,'') = isnull(Staging_informix_Coil_is.grade,'')
and isnull(lookup_product.source_gauge,0) = isnull(Staging_informix_Coil_is.gauge,0)
and isnull(lookup_product.source_width,0) = isnull(Staging_informix_Coil_is.size1,0)
)
That's the query. I need this in a workflow. Could someone help me out or give me a sample?
I'm with your project manager on this one. I would create a Data Flow Task. The first component would be an OLE DB Source, containing just your first SELECT (no WHERE clause).
The next component would be a Lookup, selecting the columns you need to match on from lookup_product. On the Columns tab I would match the columns as you have in your WHERE clause. On the General tab I would set it to Redirect Rows to No Match output.
The final component is an OLE DB Destination, pointing at the lookup_product table. I would connect this to the Lookup using the No Match output.
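Conceptually, the source, Lookup and No Match output described above behave like an anti-join. The plain Python sketch below is only meant to illustrate that row flow; it is not SSIS code, and the match_key and no_match_rows names and the sample rows are made up. It assumes product_type is never NULL, since the original query compares that column without ISNULL.

```python
def match_key(source_id, product, product_type, grade, gauge, width):
    """Match key with None replaced the way the ISNULL(...) calls do in the query."""
    def default(value, fallback):
        return fallback if value is None else value
    return (
        source_id,
        default(product, ""),
        product_type,  # assumed non-NULL, compared as-is
        default(grade, ""),
        default(gauge, 0),
        default(width, 0),
    )

def no_match_rows(staging_rows, lookup_rows):
    """Distinct staging rows whose key is absent from the lookup table."""
    existing = {match_key(*row) for row in lookup_rows}  # the Lookup's reference set
    seen = set()
    unmatched = []
    for row in staging_rows:
        if row in seen:                      # SELECT DISTINCT on the source
            continue
        seen.add(row)
        if match_key(*row) not in existing:  # goes to the No Match output
            unmatched.append(row)
    return unmatched

# Hypothetical rows: (source id, product, type, grade, gauge, width)
staging = [
    (1, "coil", "A", None, 0.5, None),
    (1, "coil", "A", None, 0.5, None),   # duplicate, dropped by DISTINCT
    (1, "sheet", "B", "x", 1.0, 48),     # already present in the lookup table
]
lookup = [(1, "sheet", "B", "x", 1.0, 48)]
print(no_match_rows(staging, lookup))
```

The rows returned by no_match_rows correspond to what the OLE DB Destination would insert into lookup_product.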

Optimize this query: getting "exceeds a resource limit" error

SELECT DISTINCT
A.IDPRE
,A.IDARTB
,A.TIREGDAT
,B.IDDATE
,B.IDINFO
,C.TIINTRO
FROM
GLHAZQ A
,PRTINFO B
,PRTCON C
WHERE
B.IDARTB = A.IDARTB
AND B.IDPRE = A.IDPRE
AND C.IDPRE = A.IDPRE
AND C.IDARTB = A.IDARTB
AND C.TIINTRO = (
SELECT MIN(TIINTRO)
FROM
PRTCON D
WHERE D.IDPRE = A.IDPRE
AND D.IDARTB = A.IDARTB)
ORDER BY C.TIINTRO
I get the error below when I run this query (DB2):
SQL0495N Estimated processor cost of "000000012093" processor seconds
("000575872000" service units) in cost category "A" exceeds a resource limit error
threshold of "000007000005" service units. SQLSTATE=57051
Please help me to fix this problem
Apparently, the workload manager is doing its job in preventing you from using too many resources. You'll need to tune your query so that its estimated cost is lower than the threshold set by your DBA. You would start by examining the query explain plan as produced by db2exfmt. If you want help, publish the plan here, along with the table and index definitions.
To produce the explain plan, perform the following 3 steps:
1. Create the explain tables by executing db2 -tf $INSTANCE_HOME/sqllib/misc/EXPLAIN.DDL
2. Generate the plan by executing the explain statement: db2 explain plan for select ...<the rest of your query>
3. Format the plan: db2exfmt -d <your db name> -1 (note the second parameter is the digit "1", not the letter "l").
To generate the table DDL statements use the db2look utility:
db2look -d <your db name> -o tables.sql -e -t GLHAZQ PRTINFO PRTCON
I am not a DB2 person, but I would expect the query syntax to be the same. In your query, you are doing a correlated sub-select to find C.TIINTRO, which can kill performance. You are also querying for all records.
I would start the query by pre-querying the MIN() value, and since you are not using any other field from the "C" alias, I would leave that table out.
SELECT DISTINCT
A.IDPRE,
A.IDARTB,
A.TIREGDAT,
B.IDDATE,
B.IDINFO,
PreQuery.TIINTRO
FROM
( SELECT D.IDPRE,
D.IDARTB,
MIN(D.TIINTRO) TIINTRO
from
PRTCON D
group by
D.IDPRE,
D.IDARTB ) PreQuery
JOIN GLHAZQ A
ON PreQuery.IDPre = A.IDPRE
AND PreQuery.IDArtB = A.IDArtB
JOIN PRTINFO B
ON PreQuery.IDPre = B.IDPRE
AND PreQuery.IDArtB = B.IDArtB
ORDER BY
PreQuery.TIINTRO
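To check that the pre-aggregated form returns the same rows as the correlated sub-select, here is a small sketch that runs both against SQLite from Python. SQLite is not DB2, so this says nothing about the estimated cost; it only compares results on a few made-up rows.

```python
import sqlite3

# Original shape: correlated sub-select picks the minimum TIINTRO per key.
ORIGINAL = """
SELECT DISTINCT A.IDPRE, A.IDARTB, A.TIREGDAT, B.IDDATE, B.IDINFO, C.TIINTRO
FROM GLHAZQ A, PRTINFO B, PRTCON C
WHERE B.IDARTB = A.IDARTB AND B.IDPRE = A.IDPRE
  AND C.IDPRE = A.IDPRE AND C.IDARTB = A.IDARTB
  AND C.TIINTRO = (SELECT MIN(TIINTRO) FROM PRTCON D
                   WHERE D.IDPRE = A.IDPRE AND D.IDARTB = A.IDARTB)
ORDER BY C.TIINTRO
"""

# Rewritten shape: the minimum is computed once per key, then joined.
REWRITTEN = """
SELECT DISTINCT A.IDPRE, A.IDARTB, A.TIREGDAT, B.IDDATE, B.IDINFO, PreQuery.TIINTRO
FROM (SELECT D.IDPRE, D.IDARTB, MIN(D.TIINTRO) AS TIINTRO
      FROM PRTCON D
      GROUP BY D.IDPRE, D.IDARTB) PreQuery
JOIN GLHAZQ A ON PreQuery.IDPRE = A.IDPRE AND PreQuery.IDARTB = A.IDARTB
JOIN PRTINFO B ON PreQuery.IDPRE = B.IDPRE AND PreQuery.IDARTB = B.IDARTB
ORDER BY PreQuery.TIINTRO
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GLHAZQ (IDPRE INTEGER, IDARTB INTEGER, TIREGDAT TEXT);
CREATE TABLE PRTINFO (IDPRE INTEGER, IDARTB INTEGER, IDDATE TEXT, IDINFO TEXT);
CREATE TABLE PRTCON (IDPRE INTEGER, IDARTB INTEGER, TIINTRO INTEGER);
-- Hypothetical sample rows
INSERT INTO GLHAZQ VALUES (1, 10, '2020-01-01'), (2, 20, '2020-02-01');
INSERT INTO PRTINFO VALUES (1, 10, 'd1', 'i1'), (2, 20, 'd2', 'i2');
INSERT INTO PRTCON VALUES (1, 10, 5), (1, 10, 3), (2, 20, 7);
""")

original_rows = conn.execute(ORIGINAL).fetchall()
rewritten_rows = conn.execute(REWRITTEN).fetchall()
print(original_rows == rewritten_rows)
```

Whether the rewrite actually gets under the resource limit still has to be confirmed with the DB2 explain plan.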
I would ensure you have indexes on
Table | Index keys
PRTCON | (IDPRE, IDARTB, TIINTRO)
GLHAZQ | (IDPRE, IDARTB)
PRTINFO | (IDPRE, IDARTB)
If you really DO need your "C" table, you could just add it as another JOIN, such as
JOIN PRTCON C
ON PreQuery.IDPre = C.IDPre
AND PreQuery.IDArtB = C.IDArtB
AND PreQuery.TIIntro = C.TIIntro
With such a high cost, you might be better off having "covering indexes" on
GLHAZQ (IDPRE, IDARTB, TIREGDAT)
PRTINFO (IDPRE, IDARTB, IDDATE, IDINFO)
This way, the index has all the elements you are returning in the query, so the engine does not have to go back to the actual pages of data. It can get the values from the index directly.