Query Variable Table without storing variables - sql

Salesforce Marketing Cloud queries do not allow variables or temporary tables according to the "SQL Support" section of this official documentation (http://help.marketingcloud.com/en/documentation/exacttarget/interactions/activities/query_activity/)
I have a data extension called Parameters_DE with fields Name and Value that stores constant values. I need to refer to this DE in queries.
Using Transact-SQL, an example is:
DECLARE @number INT
SET @number = (SELECT Value FROM Parameters_DE WHERE Name='LIMIT')
SELECT * FROM Items_DE
WHERE Price < @number
How can the above be done without variables or temporary tables so that I can refer to the value of the 'LIMIT' variable that is stored in Parameters_DE and so that the query will work in Marketing Cloud?

This is what I would have done anyway, even if variables were allowed:
SELECT i.*
FROM Items_DE i
INNER JOIN Parameters_DE p ON p.Name = 'LIMIT'
WHERE i.Price < p.Value
Wanting to use a variable is indicative of still thinking procedurally instead of in sets. Note that, if you need to, you can join to the Parameters_DE table more than once (give it a different alias each time) to use the values of different parameters in different parts of a query.
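A minimal sketch of joining the parameter table twice under different aliases. It uses Python's sqlite3 module as a stand-in for Marketing Cloud, and the 'FLOOR' parameter, the Item column and the sample rows are assumptions made up for the illustration:

```python
import sqlite3

# In-memory SQLite database standing in for the data extensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parameters_DE (Name TEXT PRIMARY KEY, Value INTEGER);
    CREATE TABLE Items_DE (Item TEXT, Price INTEGER);
    -- 'FLOOR' is a hypothetical second parameter, not from the question.
    INSERT INTO Parameters_DE VALUES ('FLOOR', 10), ('LIMIT', 50);
    INSERT INTO Items_DE VALUES ('pen', 5), ('book', 20), ('lamp', 45), ('desk', 120);
""")

# Each alias of Parameters_DE contributes exactly one row, so the joins
# do not multiply the rows of Items_DE.
rows = conn.execute("""
    SELECT i.Item, i.Price
    FROM Items_DE i
    INNER JOIN Parameters_DE lo ON lo.Name = 'FLOOR'
    INNER JOIN Parameters_DE hi ON hi.Name = 'LIMIT'
    WHERE i.Price >= lo.Value
      AND i.Price < hi.Value
    ORDER BY i.Price
""").fetchall()

print(rows)
```

With this sample data only the items priced between the two parameter values come back.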
You can also make things more efficient for this type of query by having a parameters table with one row, and a column for each value you need. Then you can JOIN to the table one time with a 1=1 condition and look at just the columns you need. Of course, this idea has limitations, too.
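The one-row variant could look roughly like this. Again this is a SQLite sketch run from Python, and the table name Params_Wide, its columns PriceFloor and PriceLimit, and the sample rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical one-row parameters table: one column per value.
    CREATE TABLE Params_Wide (PriceFloor INTEGER, PriceLimit INTEGER);
    INSERT INTO Params_Wide VALUES (10, 50);
    CREATE TABLE Items_DE (Item TEXT, Price INTEGER);
    INSERT INTO Items_DE VALUES ('pen', 5), ('book', 20), ('lamp', 45), ('desk', 120);
""")

# A single join on an always-true condition exposes every parameter column.
wide_rows = conn.execute("""
    SELECT i.Item, i.Price
    FROM Items_DE i
    INNER JOIN Params_Wide p ON 1 = 1
    WHERE i.Price >= p.PriceFloor
      AND i.Price < p.PriceLimit
    ORDER BY i.Price
""").fetchall()

print(wide_rows)
```

This only behaves as intended while Params_Wide holds exactly one row. A second row would duplicate every matching item, which is one of the limitations mentioned above.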

You could just use the SELECT which retrieves the number in your WHERE clause:
SELECT * FROM Items_DE
WHERE Price < (SELECT Value FROM Parameters_DE WHERE Name='LIMIT')

This can be done with a join:
SELECT i.*
FROM Items_DE i
INNER JOIN Parameters_DE p
ON p.Name = 'LIMIT'
AND i.Price < p.Value
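To check that the scalar-subquery form and the join form select the same rows, here is a small sketch using Python's sqlite3 module. The join compares i.Price < p.Value, and the Item column and sample data are assumptions for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parameters_DE (Name TEXT PRIMARY KEY, Value INTEGER);
    CREATE TABLE Items_DE (Item TEXT, Price INTEGER);
    INSERT INTO Parameters_DE VALUES ('LIMIT', 50);
    INSERT INTO Items_DE VALUES ('pen', 5), ('book', 20), ('lamp', 45), ('desk', 120);
""")

# Form 1: scalar subquery in the WHERE clause.
subquery_rows = conn.execute("""
    SELECT Item, Price
    FROM Items_DE
    WHERE Price < (SELECT Value FROM Parameters_DE WHERE Name = 'LIMIT')
    ORDER BY Price
""").fetchall()

# Form 2: join against the single matching parameter row.
join_rows = conn.execute("""
    SELECT i.Item, i.Price
    FROM Items_DE i
    INNER JOIN Parameters_DE p
        ON p.Name = 'LIMIT'
       AND i.Price < p.Value
    ORDER BY i.Price
""").fetchall()

print(subquery_rows == join_rows, join_rows)
```

Both forms rely on Name being unique in Parameters_DE. If 'LIMIT' appeared twice, the join would return duplicate items.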

Related

Can I leverage BigQuery (BQ) partition via a join?

I am a Tableau designer, and we are building some views that get filtered by category a lot. Because of this, we tried to create a category_id that would serve as the partition. The problem seems to be that if I filter the data by category only, the partition doesn't get used and we are charged for scanning the whole table.
Our team is trying to see if this could be minimized by using a nested query as follows:
SELECT *
FROM table a
INNER JOIN (
SELECT DISTINCT category_id, category
FROM table
) b
ON a.category_id = b.category_id
WHERE b.category = 'Category A'
The idea is that we could show the user b.category, they select it in Tableau and then the inner join would kick off the partition and limit the bytes returned. When I try this in the BQ interface, the estimated returned size comes back the same.
You'll need to filter on the partitioned field before you make the inner join.
I haven't used Tableau before, so I don't know if this is possible, but here is an idea: you could create a parameter that is set by the category chosen in Tableau and reference it in the WHERE clause of the query on the partitioned table.
SELECT *
FROM table a
INNER JOIN (
SELECT DISTINCT category_id, category
FROM table
WHERE category = @chosen_category
) b
ON a.category_id = b.category_id;
When you say that the partition isn't used when you filter only by category, have you actually tested this by querying the table from the console? If the partition isn't used there either, then you need to look at the partition, but if it is, then you need to take another look at your Tableau query.
VizQL (Viz query language) is Tableau's sql parser that converts your Tableau viz into SQL for execution, so whilst you cannot really modify the outgoing SQL, you can at least capture it and test which enables you to identify poor performing calculations and/or vizzes, as well as optimise the backend for the queries that Tableau will send.
I've written an article about this here: https://datawonders.atlassian.net/wiki/spaces/TABLEAU/pages/1290600449/Let+s+Talk+Errors+Tuning+6+minute+read
The thing about Tableau is that it treats the source as a derived table, with all filters being placed at the upper-level of the query immediately before the stream,
so your query:
Select *
From table a
Join (
Select Distinct Category_ID, Category
From table
)b On a.category_id = b.category_id
Where b.category = 'Category A'
Will actually look like this (assuming you just select everything):
Select a1.*
From (
Select *
From table a
Join (
Select Distinct Category_ID, Category
From table
)b On a.category_id = b.category_id
)a1
Where a1.category = <your selected category>
So you can see from here that being two-levels deep, your Category table just won't be hit, instead everything shall be read into the spool, the join taking place in tempdb, and only the complete set is filtered immediately before streaming to Tableau.
Bad, underperforming sql it most certainly is.
And this is where the relational method of v2020.2 comes into play, as this has been designed to treat each table as a separate exclusive entity, joins are only made at execution time, so you could build a view that uses data from table a where you are using table b to provide the filtering.
As an alternative, my preferred overall method is to switch entirely to Custom SQL with parameters. This lets you craft and test your own SQL to create a high-performance, low-load query. Because parameters are parsed before the query is executed, you can place the filtering deep down in the query without the need for a secondary look-up table or filtered derived statement. A select distinct as you are currently using it is still going to produce a large plan because, unless the category column is indexed, the engine still needs to read every record from the table.
So using parameters, your new query will look something like:
Select a1.*
From (
Select *
From table a
Join lookup_table b On a.category_id = b.category_id
And b.category = <parameters.pCategory>
)a1
(I've placed the filter condition directly onto the join as this can improve performance in some circumstances, though this actually shouldn't make much difference)
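The shape of the parameterised query above can be tried out locally. This is only a sketch using Python's sqlite3 module: SQLite has no partitions and no Tableau parameters, so it shows the query structure and the parameter binding, not any saving in bytes scanned. The names fact_table and amount and the sample rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- fact_table stands in for "table a"; both tables are made-up samples.
    CREATE TABLE fact_table (category_id INTEGER, amount INTEGER);
    CREATE TABLE lookup_table (category_id INTEGER PRIMARY KEY, category TEXT);
    INSERT INTO lookup_table VALUES (1, 'Category A'), (2, 'Category B');
    INSERT INTO fact_table VALUES (1, 100), (1, 150), (2, 999);
""")

# The filter sits on the join inside the derived table, so it is part of
# the inner query rather than being applied to the finished outer result.
query = """
    SELECT a1.*
    FROM (
        SELECT a.category_id, a.amount, b.category
        FROM fact_table a
        JOIN lookup_table b
          ON a.category_id = b.category_id
         AND b.category = :pCategory
    ) a1
    ORDER BY a1.amount
"""

# :pCategory plays the role of the Tableau parameter.
category_rows = conn.execute(query, {"pCategory": "Category A"}).fetchall()
print(category_rows)
```

Whether the engine actually prunes partitions still has to be confirmed against the real BigQuery table, for example by comparing the estimated bytes in the console.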
And when used in conjunction with the Set parameter action, you can now use parameters as in/out updateable variables which shall update as the user interacts directly with the viz, instead of the user needing to manually update as they go. If you haven't used these before, I wrote an article about it here: https://community.tableau.com/s/news/a0A4T00000313S0UAI/psst-have-you-had-a-go-with-variables-in-tableau-yet
Steve

Inner join on fields with different domains

I have a query which looks a bit like this:
SELECT a~matnr AS material,
b~aedat AS date,
SUM( c~ormng ) AS dummy
INTO TABLE @gt_dummy
FROM ekpo AS a
INNER JOIN ekko AS b
ON b~ebeln = a~ebeln
INNER JOIN lips AS c
ON c~vgbel = a~ebeln
AND c~vgpos = a~ebelp
INNER JOIN likp AS d
ON d~vbeln = c~vbeln
WHERE a~matnr IN @gr_dummy1
AND a~werks IN @gr_dummy2
GROUP BY a~matnr, b~aedat
ORDER BY a~matnr, b~aedat.
It's not going to work because LIPS-VGPOS and EKPO-EBELP have different domains so '00010' in EBELP would be '000010' in VGPOS. An alternative to this would be to just get the EKPO data, use a conversion exit function and then use a FOR ALL ENTRIES query to get the LIPS entries. Then since you can't use SUM and GROUP BY with FOR ALL ENTRIES I would need to do the summations manually.
Of course it's not a huge amount of work to do all this but I'm interested if there's a quicker way to do this e.g. in a single query? Thanks.
EDIT: We're on 7.40
Unfortunately, I only see two possibilities before ABAP 7.50:
FOR ALL ENTRIES as you suggested
Execute "native" SQL directly on the database connected to your ABAP software. This can be done with EXEC SQL or ADBC (class CL_SQL_STATEMENT and so on), or AMDP if your database is HANA.
It's not your version, but for ABAP >= 7.50 there are SQL string functions, for instance SUBSTRING:
SELECT a~ebeln, a~ebelp, c~vbeln, c~posnr
FROM ekpo AS a
INNER JOIN lips AS c
ON c~vgbel = a~ebeln
AND substring( c~vgpos, 2, 5 ) = a~ebelp
INTO TABLE @DATA(gt_dummy).
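The effect of the substring join can be reproduced outside ABAP. The following sketch uses Python's sqlite3 module with made-up rows. It is not Open SQL, and SQLite's substr() merely stands in for SUBSTRING. It shows that the plain equality join finds nothing because of the 5-character versus 6-character keys, while the substring join allows SUM and GROUP BY in a single query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical versions of EKPO and LIPS with sample data.
    CREATE TABLE ekpo (ebeln TEXT, ebelp TEXT, matnr TEXT);
    CREATE TABLE lips (vbeln TEXT, posnr TEXT, vgbel TEXT, vgpos TEXT, ormng INTEGER);
    INSERT INTO ekpo VALUES
        ('4500000001', '00010', 'MAT-A'),
        ('4500000001', '00020', 'MAT-B');
    INSERT INTO lips VALUES
        ('0080000001', '000010', '4500000001', '000010', 5),
        ('0080000002', '000010', '4500000001', '000010', 7),
        ('0080000003', '000010', '4500000001', '000020', 3);
""")

# Plain equality never matches: '00010' (5 chars) vs '000010' (6 chars).
naive_matches = conn.execute("""
    SELECT COUNT(*)
    FROM ekpo a
    JOIN lips c ON c.vgbel = a.ebeln AND c.vgpos = a.ebelp
""").fetchone()[0]

# Dropping the leading character of VGPOS aligns the two keys.
totals = conn.execute("""
    SELECT a.matnr, SUM(c.ormng)
    FROM ekpo a
    JOIN lips c
      ON c.vgbel = a.ebeln
     AND substr(c.vgpos, 2, 5) = a.ebelp
    GROUP BY a.matnr
    ORDER BY a.matnr
""").fetchall()

print(naive_matches, totals)
```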
If you have at least ehp5 on 7.40, you can use CDS views in a workaround for FOR ALL ENTRIES with SUM.
Join EKKO and EKPO in OpenSQL
Create a CDS view on LIPS using fields VGBEL, VGPOS, SUM(ORMNG), with GROUP BY on the first two
Call this CDS view with FOR ALL ENTRIES
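The three steps above can be imitated in a rough analogue. This is Python with sqlite3, not ABAP or CDS syntax: an ordinary SQL view plays the part of the CDS view, a Python loop plays the part of FOR ALL ENTRIES, and zfill(6) stands in for the key conversion. All names except the SAP field names are assumptions:

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ekpo (ebeln TEXT, ebelp TEXT, matnr TEXT);
    CREATE TABLE lips (vbeln TEXT, vgbel TEXT, vgpos TEXT, ormng INTEGER);
    INSERT INTO ekpo VALUES
        ('4500000001', '00010', 'MAT-A'),
        ('4500000001', '00020', 'MAT-B');
    INSERT INTO lips VALUES
        ('0080000001', '4500000001', '000010', 5),
        ('0080000002', '4500000001', '000010', 7),
        ('0080000003', '4500000001', '000020', 3);

    -- Stand-in for the CDS view: the quantity is summed per VGBEL/VGPOS.
    CREATE VIEW lips_totals AS
        SELECT vgbel, vgpos, SUM(ormng) AS ormng_total
        FROM lips
        GROUP BY vgbel, vgpos;
""")

# Step 1: read the purchase order items (the EKKO join is omitted here).
po_items = conn.execute("SELECT ebeln, ebelp, matnr FROM ekpo").fetchall()

# Steps 2 and 3: look up the pre-aggregated totals for each item, widening
# the 5-character item number to the 6 characters used in VGPOS.
totals_by_material = defaultdict(int)
for ebeln, ebelp, matnr in po_items:
    row = conn.execute(
        "SELECT ormng_total FROM lips_totals WHERE vgbel = ? AND vgpos = ?",
        (ebeln, ebelp.zfill(6)),
    ).fetchone()
    if row is not None:
        totals_by_material[matnr] += row[0]

print(dict(totals_by_material))
```

Because the view already aggregates per document item, the only summation left in application code is the roll-up from items to materials.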

Pass int from outer query into OPENQUERY used as subquery

I am trying to improve the performance of a very large and complex query. Below are the relevant portions. I pass the id to the where clause and get back an orderid. I need to get the order comments from another database on a linked server. I understand that I have to pass the query string to OpenQuery and that it cannot have dynamic values. In my example I've hard-coded it.
How do I get the S.OrderId value and pass it to the OpenQuery? I've tried some of the examples, but none do this with a subquery. DECLARE and SET throw errors inside my main SELECT.
SELECT S.ID AS Id
, S.OrderID
, (SELECT * FROM OPENQUERY(SQL2014A, 'SELECT TOP 1 SECCOMMENT FROM TMWAMS.dbo.ORDERSEC WHERE ORDERID = 1515552')) AS COMMENTS
FROM ShopPO S WHERE ID = 230

Specifying SELECT, then joining with another table

I just hit a wall with my SQL query fetching data from my MS SQL Server.
To simplify, say I have one table for sales and one table for customers. They each have a corresponding userId which I can use to join the tables.
I wish to first SELECT from the sales table where, say, price is equal to 10, and then join it on the userId in order to get access to the name, address, etc. from the customer table.
In which order should I structure the query? Do I need some sort of subquery, or what do I do?
I have tried something like this
SELECT *
FROM Sales
WHERE price = 10
INNER JOIN Customers
ON Sales.userId = Customers.userId;
Needless to say this is very simplified and not my database schema, yet it explains my problem simply.
Any suggestions ? I am at a loss here.
A SELECT has a certain order of its components
In the simple form this is:
What do I select: column list
From where: table name and joined tables
Are there filters: WHERE
How to sort: ORDER BY
So: most likely it was enough to change your statement to
SELECT *
FROM Sales
INNER JOIN Customers ON Sales.userId = Customers.userId
WHERE price = 10;
The WHERE clause must follow the joins:
SELECT * FROM Sales
INNER JOIN Customers
ON Sales.userId = Customers.userId
WHERE price = 10
This is simply the way SQL syntax works. You seem to be trying to put the clauses in the order that you think they should be applied, but SQL is a declarative language, not a procedural one: you are defining what you want to occur, not how it will be done.
You could also write the same thing like this:
SELECT * FROM (
SELECT * FROM Sales WHERE price = 10
) AS filteredSales
INNER JOIN Customers
ON filteredSales.userId = Customers.userId
This may seem like it indicates a different order for the operations to occur, but it is logically identical to the first query, and in either case, the database engine may determine to do the join and filtering operations in either order, as long as the result is identical.
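A quick way to convince yourself of this is to run both forms against the same data. The sketch below uses Python's sqlite3 module rather than MS SQL Server, and the saleId and name columns and the sample rows are assumptions for the illustration. It also shows that putting WHERE before the JOIN is rejected as a syntax error:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (saleId INTEGER, userId INTEGER, price INTEGER);
    CREATE TABLE Customers (userId INTEGER, name TEXT);
    INSERT INTO Customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO Sales VALUES (100, 1, 10), (101, 2, 25), (102, 2, 10);
""")

# Form 1: join first in the text, filter in the WHERE clause.
join_then_filter = conn.execute("""
    SELECT s.saleId, c.name
    FROM Sales s
    INNER JOIN Customers c ON s.userId = c.userId
    WHERE s.price = 10
    ORDER BY s.saleId
""").fetchall()

# Form 2: filter inside a derived table, then join.
filter_then_join = conn.execute("""
    SELECT f.saleId, c.name
    FROM (SELECT * FROM Sales WHERE price = 10) AS f
    INNER JOIN Customers c ON f.userId = c.userId
    ORDER BY f.saleId
""").fetchall()

# The clause order from the question does not parse at all.
try:
    conn.execute("""
        SELECT * FROM Sales
        WHERE price = 10
        INNER JOIN Customers ON Sales.userId = Customers.userId
    """)
    where_before_join_ok = True
except sqlite3.OperationalError:
    where_before_join_ok = False

print(join_then_filter == filter_then_join, where_before_join_ok)
```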
Sounds fine to me, did you run the query and check?
SELECT s.*, c.*
FROM Sales s
INNER JOIN Customers c
ON s.userId = c.userId
WHERE s.price = 10;

In an EXISTS can my JOIN ON use a value from the original select

I have an order system. Users can be attached to different orders as different types of user. They can download documents associated with an order. Documents are only given to certain types of users on the order. I'm having trouble writing the query to check a user's permission to view a document and select the info about the document.
I have the following tables and (applicable) fields:
Docs: DocNo, FileNo
DocAccess: DocNo, UserTypeWithAccess
FileUsers: FileNo, UserType, UserNo
I have the following query:
SELECT Docs.*
FROM Docs
WHERE DocNo = 1000
AND EXISTS (
SELECT * FROM DocAccess
LEFT JOIN FileUsers
ON FileUsers.UserType = DocAccess.UserTypeWithAccess
AND FileUsers.FileNo = Docs.FileNo /* Errors here */
WHERE DocAccess.UserNo = 2000 )
The trouble is that in the Exists Select, it does not recognize Docs (at Docs.FileNo) as a valid table. If I move the second on argument to the where clause it works, but I would rather limit the initial join rather than filter them out after the fact.
I can get around this a couple ways, but this seems like it would be best. Anything I'm missing here? Or is it simply not allowed?
I think this is a limitation of your database engine. In most databases, docs would be in scope for the entire subquery, including both the where and on clauses.
However, you do not need to worry about where you put the particular clause. SQL is a descriptive language, not a procedural language. The purpose of SQL is to describe the output. The SQL engine, parser, and compiler should be choosing the most optimal execution path. Not always true. But, move the condition to the where clause and don't worry about it.
It is not clear to me why you need to join with FileUsers at all in your subquery.
What is the purpose and idea of the query (in plain English)?
In any case, if you do need to join with FileUsers, then I suggest using an inner join and moving the second filter to the WHERE condition. I don't think you can use it in the JOIN condition in a subquery; at least I've never seen it used this way before. I believe you can only correlate through the WHERE clause.
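A sketch of that suggestion, with the correlation to Docs placed in the WHERE clause of the EXISTS subquery. It runs on SQLite through Python rather than on the asker's engine. It assumes that UserNo lives on FileUsers (as in the field list given in the question) and that DocAccess should also be tied to the document by DocNo, and the helper name visible_docs and the sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Docs (DocNo INTEGER, FileNo INTEGER);
    CREATE TABLE DocAccess (DocNo INTEGER, UserTypeWithAccess TEXT);
    CREATE TABLE FileUsers (FileNo INTEGER, UserType TEXT, UserNo INTEGER);
    INSERT INTO Docs VALUES (1000, 77);
    INSERT INTO DocAccess VALUES (1000, 'buyer');
    INSERT INTO FileUsers VALUES (77, 'buyer', 2000), (77, 'seller', 3000);
""")

def visible_docs(user_no, doc_no):
    """Return the document row only if the user's type grants access to it."""
    return conn.execute("""
        SELECT d.DocNo, d.FileNo
        FROM Docs d
        WHERE d.DocNo = ?
          AND EXISTS (
              SELECT 1
              FROM DocAccess acc
              INNER JOIN FileUsers usr
                 ON usr.UserType = acc.UserTypeWithAccess
              -- The references to the outer table d sit in the WHERE clause.
              WHERE acc.DocNo = d.DocNo      -- assumption: access is per document
                AND usr.FileNo = d.FileNo
                AND usr.UserNo = ?           -- assumption: UserNo is on FileUsers
          )
    """, (doc_no, user_no)).fetchall()

print(visible_docs(2000, 1000))  # user attached as 'buyer'
print(visible_docs(3000, 1000))  # user attached as 'seller'
```

With an inner join, conditions in ON and in WHERE filter the same rows, so moving the correlated condition does not change the result.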
You have to use aliases to get this working:
SELECT
doc.*
FROM
Docs doc
WHERE
doc.DocNo = 1000
AND EXISTS (
SELECT
*
FROM
DocAccess acc
LEFT OUTER JOIN
FileUsers usr
ON
usr.UserType = acc.UserTypeWithAccess
AND usr.FileNo = doc.FileNo
WHERE
acc.UserNo = 2000
)
This also makes it more clear which table each field belongs to (think about using the same table twice or more in the same query with different aliases).
If you would only like to limit the output to one row you can use TOP 1:
SELECT TOP 1
doc.*
FROM
Docs doc
INNER JOIN
FileUsers usr
ON
usr.FileNo = doc.FileNo
INNER JOIN
DocAccess acc
ON
acc.UserTypeWithAccess = usr.UserType
WHERE
doc.DocNo = 1000
AND acc.UserNo = 2000
Of course, the second query works a bit differently from the first one (both JOINs are INNER). Depending on your data model, you might even leave the TOP 1 out of that query.