How to avoid long SQL in ClickHouse

Because ClickHouse runs complex SQL efficiently, we now write single queries that are hundreds of lines long and join many source tables to get real-time analysis results. The problem is that these queries are very hard to maintain and reuse.
We use MyBatis to write the SQL in our Java application. We tried MyBatis's global parameters to define some reusable SQL, but that does not help much with queries this large.
Is there any way to avoid such long SQL, or to make it more readable and easier to engineer?
<select id="brandAnalysis" resultType="java.util.HashMap">
SELECT gmvRank.id_code,
gmvRank.rank,
gmvRank.lastRank,
gmvRank.gmv as sale_money,
member.dealerMembers as dealer_turnover,
member.teamMembers,
case
when member.dealerMembers = 0 then 0
when member.teamMembers = 0 then 0
else member.dealerMembers/member.teamMembers end AS dealer_turnover_active
FROM (
SELECT id_code,
<choose>
<when test="monthQuarter == 3">
case
when max(`quarter`) = #{quarter} then argMax(${sqlRankField},`month`)
else null end as rank,
case
when max(`quarter`) = #{quarter} then argMax(${sqlGmvField},`month`)
else null end as gmv,
case
when min(`quarter`) = #{compareQuarter} then argMin(${sqlRankField},`month`)
else null end as lastRank
</when>
<otherwise>
case
when max(`month`) = #{month} then argMax(${sqlRankField},`month`)
else null end as rank,
case
when max(`month`) = #{month} then argMax(${sqlGmvField},`month`)
else null end as gmv,
case
when min(`month`) = #{compareMonth} then argMin(${sqlRankField},`month`)
else null end as lastRank
</otherwise>
</choose>
FROM
tengju.dws_crm_brand_analysis_rank_dd_m
<choose>
<when test="monthQuarter == 3">
PREWHERE quarter in (#{quarter},#{compareQuarter})
and insert_time in (
SELECT MAX(insert_time)
FROM tengju.dws_crm_brand_analysis_rank_dd_m
PREWHERE quarter in (#{quarter},#{compareQuarter})
GROUP BY quarter
)
</when>
<otherwise>
PREWHERE month in (#{month},#{compareMonth})
and insert_time in (
SELECT MAX(insert_time)
FROM tengju.dws_crm_brand_analysis_rank_dd_m
PREWHERE month in (#{month},#{compareMonth})
GROUP BY month
)
</otherwise>
</choose>
<if test="dominationIdCode != null and dominationLevel != null">
and id_code in (
SELECT id_code
FROM tengju.dwd_user_domination_relation_map_all
PREWHERE visit_date >= formatDateTime(yesterday(),'%F')
AND insert_time in (
SELECT max(insert_time)
FROM tengju.dwd_user_domination_relation_map_all
PREWHERE visit_date >= formatDateTime(yesterday(),'%F')
)
AND domination_star_level = #{dominationLevel}
AND domination_id_code = #{dominationIdCode}
<if test="starLevels != null and starLevels.size() > 0">
AND star_level in
<foreach collection="starLevels" item="item" index="index" open="(" close=")" separator=",">
#{item}
</foreach>
</if>
<if test="isDirect != null and isDirect > 0">
AND parent_id_code = #{dominationIdCode}
</if>
)
</if>
<if test="idCodes != null and idCodes.size() > 0">
AND id_code in
<foreach collection="idCodes" item="item" index="index" open="(" close=")" separator=",">
#{item}
</foreach>
</if>
AND brand_id = #{brandId}
GROUP BY id_code
) as gmvRank
GLOBAL LEFT JOIN (
<choose>
<when test="crmRankCrowdPerspective == 4">
<include refid="smallTeamDealer" />
</when>
<when test="crmRankCrowdPerspective == 1">
<include refid="KATeamDealer" />
</when>
<otherwise>
<include refid="ecOrCityTeamDealer" />
</otherwise>
</choose>
) as member
on (gmvRank.id_code = member.id_code)
where 1=1
<if test="saleMoneyFrom != null">
AND gmvRank.gmv >= toDecimal128(#{saleMoneyFrom},2)
</if>
<if test="saleMoneyTo != null">
AND toDecimal128(#{saleMoneyTo},2) >= gmvRank.gmv
</if>
order by ${sortType}
LIMIT #{offset},#{pageSize}
</select>

Use common table expressions (CTEs) to get rid of the nested subqueries. Every make and version of database server has its own SQL dialect, and not all of them support CTEs, but all of them have views. Beyond that, SQL queries can be insanely verbose, and there is no real way around it.
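As a rough sketch of what that could look like in ClickHouse, the monthly branch of the query in the question might be flattened as shown here. The literal values and the column names rank_value and gmv_value are placeholders for the MyBatis parameters, not names from the original schema, and WHERE is used instead of PREWHERE for brevity.

```sql
-- Sketch only. Placeholders (assumptions, not from the original schema):
--   202101 / 202012      stand in for #{month} / #{compareMonth}
--   42                   stands in for #{brandId}
--   rank_value/gmv_value stand in for ${sqlRankField} / ${sqlGmvField}
WITH
    -- Latest load per month; previously an inline subquery in the filter.
    latest_batches AS (
        SELECT max(insert_time) AS latest_insert_time
        FROM tengju.dws_crm_brand_analysis_rank_dd_m
        WHERE month IN (202101, 202012)
        GROUP BY month
    ),
    -- Per-id_code rank and GMV; previously the derived table "gmvRank".
    gmv_rank AS (
        SELECT
            id_code,
            CASE WHEN max(month) = 202101 THEN argMax(rank_value, month) ELSE NULL END AS rank,
            CASE WHEN max(month) = 202101 THEN argMax(gmv_value, month) ELSE NULL END AS gmv,
            CASE WHEN min(month) = 202012 THEN argMin(rank_value, month) ELSE NULL END AS lastRank
        FROM tengju.dws_crm_brand_analysis_rank_dd_m
        WHERE month IN (202101, 202012)
          AND insert_time IN (SELECT latest_insert_time FROM latest_batches)
          AND brand_id = 42
        GROUP BY id_code
    )
SELECT id_code, rank, lastRank, gmv AS sale_money
FROM gmv_rank
ORDER BY id_code
LIMIT 0, 20;
```

Each CTE could also live in its own MyBatis <sql> fragment and be pulled in with <include refid="..." />, the same mechanism the query in the question already uses for the member subquery, so the main statement reads as a short list of named steps.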

Related

Split XML report in SQL

I have the SQL stored procedure below, which generates a report and sends it to me via email.
However, it sends all the rows as one big report.
Instead, I want it to produce a separate report for each error.
Any idea how I can split the query below so that individual XML reports are generated?
DECLARE @Report XML
BEGIN
UPDATE [Orders].dbo.PurOrd
SET [Status] = 'Failed', Reason = '<e id="0" message="failed test order" />'
WHERE [Status] = 'InProcess'
SET @Report = (
SELECT
p.Name as "@Name",
p.Customer "@Customer",
CASE p.Name
WHEN 'Default' THEN convert(xml,p.RejectedReason)
ELSE convert(xml,f.RejectedReason)
END AS "RejectedReason",
(
SELECT u.first_name as "@FirstName",
u.last_name as "@LastName"
FROM [Users].dbo.Users u
WHERE u.user_id = u.user_id
for xml PATH('Users'), type
),
(
SELECT
li.Product as "@PId",
li.Quantity as "@Quantity",
li.SalePrice as "@Price"
FROM [Cart].dbo.LineItems li
WHERE li.OrderFormId = f.OrderFormId
ORDER BY li.ItemNumber
for xml PATH('LineItem'), type
)
FROM [Orders].dbo.OrForms f
JOIN [Orders].dbo.PurOrd p on f.GroupId = p.GroupId
WHERE
(p.Status = 'OrderRejected' AND p.ReportStatus IS NULL)
FOR XML PATH('test'), TYPE
)
SELECT @Report FOR XML PATH('Report')
Current Output:
<Report>
<PurchaseOrder OrderId="Order 1" Name="name1" Seller="abc">
<Reason>
<errors>
<e id="0" message="failed test ord" />
</errors>
</Reason>
<Users FirstName="abc" LastName="xyz"/>
<LineItem ProductId="Clothes1" Quantity="1.0000" SalePrice="100rs"/>
</PurchaseOrder>
<PurchaseOrder OrderId="Order 2" Name="name 2" Seller="abc">
<Reason>
<errors>
<e id="0" message="failed test ord" />
</errors>
</Reason>
<Users FirstName="abc" LastName="xyz"/>
<LineItem ProductId="Clothes1" Quantity="1.0000" SalePrice="100rs"/>
</PurchaseOrder>
</Report>
EXPECTED OUTPUT:
<Report>
<PurchaseOrder OrderId="Order 1" Name="name1" Seller="abc">
<Reason>
<errors>
<e id="0" message="failed test ord" />
</errors>
</Reason>
<Users FirstName="abc" LastName="xyz"/>
<LineItem ProductId="Clothes1" Quantity="1.0000" SalePrice="100rs"/>
</PurchaseOrder>
</Report>
<Report>
<PurchaseOrder OrderId="Order 2" Name="name 2" Seller="abc">
<Reason>
<errors>
<e id="0" message="failed test ord" />
</errors>
</Reason>
<Users FirstName="abc" LastName="xyz"/>
<LineItem ProductId="Clothes1" Quantity="1.0000" SalePrice="100rs"/>
</PurchaseOrder>
</Report>
One way to get
<Report>
<PurchaseOrder ... >
...
</PurchaseOrder>
</Report>
<Report>
<PurchaseOrder ...>
...
</PurchaseOrder>
</Report>
structure is to wrap your select list in a CROSS APPLY, something like:
SELECT t.PurchaseOrder
FROM [Orders].dbo.OrForms f
JOIN [Orders].dbo.PurOrd p on f.GroupId = p.GroupId
CROSS APPLY (
SELECT
-- original select list copy
p.Name as "@Name",
p.Customer "@Customer",
CASE p.Name
WHEN 'Default' THEN convert(xml,p.RejectedReason)
ELSE convert(xml,f.RejectedReason)
END AS "RejectedReason",
(
SELECT u.first_name as "@FirstName",
u.last_name as "@LastName"
FROM [Users].dbo.Users u
WHERE u.user_id = u.user_id
for xml PATH('Users'), type
),
(
SELECT
li.Product as "@PId",
li.Quantity as "@Quantity",
li.SalePrice as "@Price"
FROM [Cart].dbo.LineItems li
WHERE li.OrderFormId = f.OrderFormId
ORDER BY li.ItemNumber
for xml PATH('LineItem'), type
)
-- original select list copy END
FOR XML PATH(''), TYPE
) t(PurchaseOrder)
WHERE
(p.Status = 'OrderRejected' AND p.ReportStatus IS NULL)
FOR XML PATH('Report'), TYPE;

Dynamic SELECT query which decides whether to use the WHERE clause in Mule 4

I am using Mule 4 and Anypoint Studio 7 and want to set up the Database connector to SELECT all customers from a SQL Server table. If the customerName query parameter is populated in the request, I want to add a WHERE clause so that only customers with that name are returned; otherwise it should return all customers.
My code is below but I am struggling to get the syntax correct.
<db:select doc:name="Select Customers" doc:id="98a4aa2f-b0b6-4fb5-ab27-d70489fd532d" config-ref="db-config">
<ee:repeatable-file-store-iterable />
<db:sql >SELECT TOP 10 * FROM MYDB.dbo.Customer $(if (attributes.queryParams.customerName != null and isEmpty(attributes.queryParams.customerName) == false) "WHERE Name = :customerName" else "")</db:sql>
<db:input-parameters ><![CDATA[#[{'customerName' : attributes.queryParams.customerName}]]]></db:input-parameters>
</db:select>
How can I do this?
Thanks
You were on the right path. I think you were only missing the evaluation tags around the SQL in the db:sql element.
<db:select doc:name="Select Customers" doc:id="98a4aa2f-b0b6-4fb5-ab27-d70489fd532d"
config-ref="db-config">
<ee:repeatable-file-store-iterable />
<db:sql>#["SELECT TOP 10 * FROM MYDB.dbo.Customer
$(if (isEmpty(attributes.queryParams.customerName) == false) "WHERE Name = :customerName" else "") "]</db:sql>
<db:input-parameters ><![CDATA[#[{'customerName' : attributes.queryParams.customerName}]]]></db:input-parameters>
</db:select>
It is easier to debug things like this with variables, so that you can see the individual values. FWIW, here's my test code:
<set-variable variableName="additionalWhereClause"
value='#[if ( isEmpty(attributes.queryParams.email) == false)
"WHERE Email = :emailParm"
else "" ]' />
<set-variable variableName="selectSql"
value="#['SELECT FirstName, LastName, Email
FROM User
$( vars.additionalWhereClause )
ORDER BY Email LIMIT 10']" />
<logger level="INFO" message="queryParams: #[attributes.queryParams]" doc:id="96c62f84-2c98-4df6-829c-e00c9fcec9ca" />
<logger level="INFO" message="additionalWhereClause #[vars.additionalWhereClause]" doc:id="0d3611b4-34ae-4ebb-b931-6d31ce3804c1" />
<logger level="INFO" message="selectSql #[vars.selectSql]" doc:id="5c56342d-9674-4891-9d7e-bb32319f4ad0" />
<db:select doc:name="MySQL Query" doc:id="e60be3e6-9b51-4b3b-9dfa-4ee0af65cb03"
config-ref="mysql-config">
<ee:repeatable-file-store-iterable />
<db:sql>#[ vars.selectSql ]</db:sql>
<db:input-parameters><![CDATA[#[{'emailParm' : attributes.queryParams.email}]]]></db:input-parameters>
</db:select>

How to get tag attribute value from xml

I have an XML document where the values are given:
<User version="5.1.0.1" ... (omitted for brevity) >
<Login usewinLogin="true" passwordNeverExpires="false" />
<Misc />
<AdditionalInfo />
<OutOfOfficeSettings isOutOfOffice="false" startDateTime="0001-01-01T00:00:00.0000000Z" />
<RegionalSettings language="de" culture="de" />
</User>
I need the value of the isOutOfOffice attribute of OutOfOfficeSettings (either false or true).
I have never done this before; can someone help me?
One option
Declare @DWUser table (Active int, settings xml )
Insert Into @DWUser values
(1,'<User version="5.1.0.1" uid="1" oid="2" w3uid="ADMIN" guid="05b6f2a2-cdf8-4b73-8d1b-86eab657654" oguid="43a0d394-e515-45ea-aa25-7822673c52a3" name="admin" active="true" eMail="admin@admin.com" distribution="NetworkStartup" securityLevel="Normal" defaultWebBasket="cc1c589a-549a-4957-959b-3b5acc4decc9"><Login usewinLogin="true" passwordNeverExpires="false" /><Misc /><AdditionalInfo /><OutOfOfficeSettings isOutOfOffice="false" startDateTime="0001-01-01T00:00:00.0000000Z" /><RegionalSettings language="de" culture="de" /></User>')
,(1,'<root><OtherXML>Hi</OtherXML></root>')
Select A.active
,Type = X.attr.value('@isOutOfOffice','varchar(100)')
From @DWUser A
Outer Apply A.settings.nodes('User/OutOfOfficeSettings') as X(attr)
Where A.Active=1
Returns
active Type
1 false
1 NULL
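If only this one attribute is needed, a shorter variant is to call value() directly on the XML column with a full path. This is a minimal sketch: the trimmed sample XML and the column alias IsOutOfOffice are illustrative, not part of the original data.

```sql
-- Sketch: read a single attribute without shredding the document via nodes().
DECLARE @DWUser TABLE (Active int, settings xml);

INSERT INTO @DWUser VALUES
 (1, '<User><OutOfOfficeSettings isOutOfOffice="false" /></User>')
,(1, '<root><OtherXML>Hi</OtherXML></root>');

-- value() requires a singleton expression, hence the trailing [1].
-- Rows whose XML has no matching attribute yield NULL.
SELECT A.Active
      ,IsOutOfOffice = A.settings.value('(/User/OutOfOfficeSettings/@isOutOfOffice)[1]', 'varchar(5)')
FROM @DWUser A
WHERE A.Active = 1;
```

The nodes() version above is still the better fit when several attributes of the same element are needed, since the element is located only once.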

Using SQL to Generate XML

I'm trying to use SQL to generate XML in the format:
<ImportSession>
<Batches>
<Batch>
<BatchFields>
<BatchField Name="Field1" Value="1" />
<BatchField Name="Field2" Value="2" />
<BatchField Name="Field3" Value="3" />
</BatchFields>
</Batch>
</Batches>
</ImportSession>
I'm using SQL Server 2008. I wrote this query:
SELECT
(SELECT
(SELECT
'Col' AS [@Name],
FiscalYear AS [@Value]
FROM [ICEM].[dbo].[ExportedBill]
WHERE ExportedBillID = 1
FOR XML PATH ('BatchField'), TYPE)
FROM [ICEM].[dbo].[ExportedBill]
WHERE ExportedBillID = 1
FOR XML PATH ('BatchFields'), ROOT ('Batch'), TYPE)
FROM
[ICEM].[dbo].[ExportedBill]
WHERE
ExportedBillID = 1
FOR XML PATH ('Batches'), ROOT ('ImportSession')
And this results in:
<ImportSession>
<Batches>
<Batch>
<BatchFields>
<BatchField Name="Col" Value="2015" />
</BatchFields>
</Batch>
</Batches>
</ImportSession>
What I need though is every column should have an entry in BatchField. Also I need the column name to show up in the name. So I should get:
<BatchField Name="FiscalYear" Value="2015" />
<BatchField Name="MeterNumber" Value="123456" />
<BatchField Name="Name" Value="John Smith" />
<BatchField Name="Utility" Value="Electricity" />
So can anyone tell me how I modify my query to get what I need?
EDIT:
I figured it out. I needed a second nested SELECT, one for each column. If the subsequent SELECTs use the same tag as a previous SELECT, their output is concatenated under the same parent tag.
SELECT
(SELECT
(SELECT
'FiscalYear' AS [@Name],
FiscalYear AS [@Value]
FROM [ICEM].[dbo].[ExportedBill]
WHERE ExportedBillID = 1
FOR XML PATH ('BatchField'), TYPE),
(SELECT 'FiscalPeriod' AS [@Name],
FiscalPeriod AS [@Value]
FROM [PEEL_ICEM].[dbo].[ExportedBill]
WHERE ExportedBillID = 1
FOR XML PATH ('BatchField'), TYPE)
FROM [ICEM].[dbo].[ExportedBill]
WHERE ExportedBillID = 1
FOR XML PATH ('BatchFields'), ROOT ('Batch'), TYPE)
FROM
[ICEM].[dbo].[ExportedBill]
WHERE
ExportedBillID = 1
FOR XML PATH ('Batches'), ROOT ('ImportSession')
The thing is, though, there will be around 70 columns in this table. I'll brute-force it for now, but if anyone knows of a better way to do this, please let me know. Cheers
You can create separate child elements by adding a blank column separator. e.g.
DECLARE @T TABLE
( FiscalYear INT,
MeterNumber INT,
Name VARCHAR(255),
Utility VARCHAR(255)
);
INSERT @T VALUES (2015, 123456, 'John Smith', 'Electricity');
SELECT [BatchField/@Name] = 'FiscalYear',
[BatchField/@Value] = FiscalYear,
'',
[BatchField/@Name] = 'MeterNumber',
[BatchField/@Value] = MeterNumber,
'',
[BatchField/@Name] = 'Name',
[BatchField/@Value] = Name,
'',
[BatchField/@Name] = 'Utility',
[BatchField/@Value] = Utility
FROM @T
FOR XML PATH('BatchFields'), ROOT('Batch');
FOR XML PATH('BatchFields'), ROOT('Batch');
Which gives:
<Batch>
<BatchFields>
<BatchField Name="FiscalYear" Value="2015" />
<BatchField Name="MeterNumber" Value="123456" />
<BatchField Name="Name" Value="John Smith" />
<BatchField Name="Utility" Value="Electricity" />
</BatchFields>
</Batch>
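With around 70 columns, listing each one by hand gets tedious. One possible alternative, sketched here under the assumption that a single row is selected (as with ExportedBillID = 1 in the question), is to serialize the row to XML first and then turn every child element into a BatchField. The aliases src, x, n and c are arbitrary names chosen for this example.

```sql
-- Sketch: build one BatchField per column without naming the columns.
DECLARE @T TABLE
( FiscalYear INT,
  MeterNumber INT,
  Name VARCHAR(255),
  Utility VARCHAR(255)
);
INSERT @T VALUES (2015, 123456, 'John Smith', 'Electricity');

SELECT
    (
        -- Each child element of <row> is one column of the source row:
        -- its element name is the column name, its text is the value.
        SELECT [@Name]  = c.value('local-name(.)', 'nvarchar(128)'),
               [@Value] = c.value('.', 'nvarchar(max)')
        FROM (SELECT x = (SELECT * FROM @T FOR XML PATH('row'), TYPE)) AS src
        CROSS APPLY src.x.nodes('/row/*') AS n(c)
        FOR XML PATH('BatchField'), TYPE
    )
FOR XML PATH('BatchFields'), ROOT('Batch');
```

Two caveats apply to this sketch: columns that are NULL are left out of the intermediate XML and therefore get no BatchField, and column names containing characters that are not valid in XML names come back in their encoded form.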

How to write dependent queries in MULE CE?

I hear that we can't have two inbound endpoints in one flow, but my requirement is shown below: the second select query needs a payload value from the first query. When I run this, it throws an exception. Please let me know if there is an alternative way to write this.
<flow name="Some">
<jdbc:inbound-endpoint queryKey="SelectSome"
connector-ref="ProConnector" doc:name="SomeFromPro"
pollingFrequency="10000" queryTimeout="-1">
<jdbc:transaction action="NONE" timeout="10" />
<jdbc:query key="SelectSome"
value="SELECT top 1 * from table1 where IsProcessed = 0" />
<jdbc:query key="SelectSome.ack"
value="update table1 set IsProcessed=1 where ID = #[map-payload:ID] " />
</jdbc:inbound-endpoint>
<jdbc:inbound-endpoint queryKey="SelectSomeBR"
connector-ref="ProConnector" doc:name="SomeBRFromPro"
pollingFrequency="1000" queryTimeout="-1">
<jdbc:transaction action="NONE" timeout="10" />
<jdbc:query key="SelectSomeBR"
value="SELECT * from table2 where IsProcessed = 0 and ParentID = #[map-payload:ID]" />
<jdbc:query key="SelectSomeBR.ack"
value="update table2 set IsProcessed=1 where ParentID = #[map-payload:ID] " />
</jdbc:inbound-endpoint>
.
.
.
.
</flow>
Any help would be appreciated.
Thank you.
You need to use an outbound select query for your second query, like:
<jdbc:outbound-endpoint queryKey="SelectSomeBR"
connector-ref="ProConnector" doc:name="SomeBRFromPro"
pollingFrequency="1000" queryTimeout="-1"
exchange-pattern="request-response">
<jdbc:transaction action="NONE" timeout="10" />
<jdbc:query key="SelectSomeBR"
value="SELECT * from table2 where IsProcessed = 0 and ParentID = #[map-payload:ID]" />
<jdbc:query key="SelectSomeBR.ack"
value="update table2 set IsProcessed=1 where ParentID = #[map-payload:ID] " />
</jdbc:outbound-endpoint>
Reference: https://developer.mulesoft.com/docs/display/current/JDBC+Transport+Reference#JDBCTransportReference-OutboundSELECTQueries
PS. This old expression syntax is obsolete: #[map-payload:ID]. Use MEL instead: #[message.payload.ID] (assuming the message payload is a map with a key named ID).