Create a hardcoded "mapping table" in Trino SQL

I have a query (several CTEs) that gets data from different sources. The output has a column called name, but I would like to map this name to a more user-friendly name.
Id|name
1|buy
2|send
3|paid
I would like to hard-code a mapping table somewhere in the query (in another CTE?). I don't want to create a separate table for it, just plain text.
name_map=[('buy', 'Item purchased'),('send', 'Parcel in transit'), ('paid', 'Charge processed')]
So output table would be:
Id|name
1|Item purchased
2|Parcel in transit
3|Charge processed
In Trino I see the functions map_from_entries and element_at, but I don't know if they would work in this case.
I know "case when" might work, but if possible, a mapping table would be more convenient.
Thanks

As a simpler alternative to the other answer, you don't actually need to create an intermediate map using map_from_entries and look up values using element_at. You can just create an inline mapping table with VALUES and use a regular JOIN to do the lookups:
WITH mapping(name, description) AS (
VALUES
('buy', 'Item purchased'),
('send', 'Parcel in transit'),
('paid', 'Charge processed')
)
SELECT description
FROM t JOIN mapping ON t.name = mapping.name
(The query assumes your data is in a table named t that contains a column named name to use for the lookup)
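As a hedged sketch of the same idea, the snippet below runs an inline VALUES mapping through Python's sqlite3 module instead of Trino (an assumption, chosen only so the example is self-contained). The table t, its sample rows, and the extra 'refund' value are hypothetical. It also uses a LEFT JOIN with COALESCE, which is a variation on the answer above, so that names without a mapping fall back to the original value instead of disappearing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table; 'refund' deliberately has no mapping entry.
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "buy"), (2, "send"), (3, "paid"), (4, "refund")],
)

query = """
WITH mapping(name, description) AS (
    VALUES
        ('buy',  'Item purchased'),
        ('send', 'Parcel in transit'),
        ('paid', 'Charge processed')
)
SELECT t.id,
       COALESCE(m.description, t.name) AS name  -- fall back to the raw name
FROM t
LEFT JOIN mapping AS m ON t.name = m.name
ORDER BY t.id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With a plain JOIN, as in the answer, the unmapped row would simply be dropped; which behaviour is preferable depends on the data.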

Super interesting idea, and I think I got it working:
with tmp as (
SELECT *
FROM (VALUES ('1', 'buy'),
('2', 'send'),
('3', 'paid')) as t(id, name)
)
SELECT element_at(name_map, name) as name
FROM tmp
JOIN (VALUES map_from_entries(
ARRAY[('buy', 'Item purchased'),
('send', 'Parcel in transit'),
('paid', 'Charge processed')])) as t(name_map) ON TRUE
Output:
name
Item purchased
Parcel in transit
Charge processed
To see a bit more of what's happening, we can look at:
SELECT *, element_at(name_map, name) as name
id|name|name_map|name
1|buy|{buy=Item purchased, paid=Charge processed, send=Parcel in transit}|Item purchased
2|send|{buy=Item purchased, paid=Charge processed, send=Parcel in transit}|Parcel in transit
3|paid|{buy=Item purchased, paid=Charge processed, send=Parcel in transit}|Charge processed
I'm not sure how efficient this is, but it's certainly an interesting idea.

SQL query with many 'AND NOT CONTAINS' statements

I am trying to exclude timezones that have a substring in them so I only have records likely from the US.
The query works fine (e.g., the first line after the OR will remove local_timezones that include 'Africa/Abidjan'), but there's got to be a better way to write it.
It's too verbose, repetitive, and I suspect it's slower than it could be. Any advice greatly appreciated. (I'm using Snowflake's flavor of SQL but not sure that matters in this case).
NOTE: I'd like to keep a timezone such as America/Los_Angeles, but not America/El_Salvador, so for this reason I don't think wildcards are a good solution.
SELECT a_col
FROM a_table
WHERE
(country = 'United States')
OR
((country is NULL and not contains (local_timezone, 'Africa')
AND
country is NULL and not contains (local_timezone, 'Asia')
AND
country is NULL and not contains (local_timezone, 'Atlantic')
AND
country is NULL and not contains (local_timezone, 'Australia')
AND
country is NULL and not contains (local_timezone, 'Etc')
AND
country is NULL and not contains (local_timezone, 'Europe')
AND
country is NULL and not contains (local_timezone, 'Araguaina')
etc etc
If you have a known list of "good things", I would make a table and then just JOIN to it. Here I made you a list of good timezones:
CREATE TABLE acceptable_timezone (tz_name text) AS
SELECT * FROM VALUES
('Pacific/Auckland'),
('Pacific/Fiji'),
('Pacific/Tahiti');
I love me some Pacific... now we have some important data in a CTE
WITH data(id, timezone) AS (
SELECT * FROM VALUES
(1, 'Pacific/Auckland'),
(2, 'Pacific/Fiji'),
(3, 'America/El_Salvador')
)
SELECT d.*
FROM data AS d
JOIN acceptable_timezone AS a
ON a.tz_name = d.timezone
ORDER BY 1;
which, as intended, does not match El Salvador:
ID|TIMEZONE
1|Pacific/Auckland
2|Pacific/Fiji
You cannot get much faster than an equijoin. But if your data contains the timezones as substrings, the table can hold wildcard patterns using %, and you can match with LIKE, much as Felipe's answer does, but as a join:
JOIN acceptable_timezone AS a
ON d.timezone LIKE a.tz_name
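A minimal sketch of the pattern-table idea applied to the original question, run through Python's sqlite3 module rather than Snowflake (an assumption made so the example is self-contained). The data table, its rows, and the tz_pattern column are hypothetical. Exact names and % patterns can live side by side in the allow-list, and rows with a NULL country are kept only when some pattern matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: country may be NULL, timezone is always present.
conn.execute("CREATE TABLE data (id INTEGER, country TEXT, local_timezone TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (1, "United States", "America/Chicago"),
        (2, None, "America/Los_Angeles"),
        (3, None, "America/El_Salvador"),
        (4, None, "Africa/Abidjan"),
        (5, None, "America/Indiana/Knox"),
    ],
)

query = """
WITH acceptable_timezone(tz_pattern) AS (
    VALUES
        ('America/Los_Angeles'),  -- exact name (note: _ is a one-character wildcard in LIKE)
        ('America/Indiana/%')     -- wildcard pattern
)
SELECT d.id
FROM data AS d
WHERE d.country = 'United States'
   OR (d.country IS NULL
       AND EXISTS (SELECT 1
                   FROM acceptable_timezone AS a
                   WHERE d.local_timezone LIKE a.tz_pattern))
ORDER BY d.id
"""

kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)
```

Using EXISTS rather than a plain JOIN avoids duplicate rows when one timezone matches more than one pattern.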
You can use LIKE ANY:
with data as
(select null country, 'something Australia maybe' local_timezone)
select *
from data
where country = 'United States'
or (
country is null
and not local_timezone like any ('%Australia%', '%Africa%', '%Atlantic%')
)

Exclude returned data when specific words or phrases exist

An example follows of data that is being returned.
ID|CensoredWord|DescriptionSnippet
1|anus|anus
2|anus|manuscript submitted
3|anus|tetanus vaccination
4|anus|oceanus proposal
5|rere|prerequisite includes
The Description Snippet contains the censored word within another word or within a phrase and could be multiple sentences long.
I'd like to exclude data from being returned when the word is anus and the snippet contains the word tetanus or manuscript or oceanus and likewise with the word rere and the snippet contains prerequisite.
I've attempted various conditions in the WHERE clause, such as:
WHERE CensoredWord = 'anus' and DescriptionSnippet NOT LIKE '%tetanus%'
OR CensoredWord = 'anus' and DescriptionSnippet NOT LIKE '%manuscript%'
OR CensoredWord = 'anus' and DescriptionSnippet NOT LIKE '%oceanus%'
OR CensoredWord = 'rere' and DescriptionSnippet NOT LIKE '%prerequisite%'
But I am coming up short. What should this look like?
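This assumes that you only want to exclude snippets that contain the censored word as a standalone word, and to ignore longer words that merely contain it.
The following works in most SQL dialects, but it's not perfect: it only recognizes words delimited by spaces, so, for example, it won't catch "anus!".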
select *
from test
where concat(' ',description_snippet,' ') not like
concat('% ',censored_word,' %')
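To make the behaviour and the limitation concrete, here is a small sketch that runs the same space-padding comparison through Python's sqlite3 module (an assumption; SQLite uses || for concatenation instead of concat). The table mirrors the sample data, and row 7 with the trailing "!" is a hypothetical addition that shows the false negative mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, censored_word TEXT, description_snippet TEXT)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "anus", "anus"),
        (2, "anus", "manuscript submitted"),
        (3, "anus", "tetanus vaccination"),
        (4, "anus", "oceanus proposal"),
        (5, "rere", "prerequisite includes"),
        (6, "rere", "no rere without anus"),
        (7, "anus", "check anus!"),  # hypothetical row: punctuation hides the word
    ],
)

# Pad both sides with spaces so only space-delimited whole words match.
query = """
SELECT id
FROM test
WHERE (' ' || description_snippet || ' ')
      NOT LIKE ('% ' || censored_word || ' %')
ORDER BY id
"""

kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)
```

Rows 1 and 6 are filtered out because the word stands alone, while row 7 slips through because "anus!" is not followed by a space.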
Some RDBMSs have functions that accept regular expressions, which gives more flexibility, for example through word boundaries.
Here's an example that works in PostgreSQL:
Test
create table test (
ID serial primary key,
censored_word varchar(30),
description_snippet varchar(30)
);
insert into test (id, censored_word, description_snippet) values
(1, 'anus', 'anus')
, (2, 'anus', 'manuscript submitted')
, (3, 'anus', 'tetanus vaccination')
, (4, 'anus', 'oceanus proposal')
, (5, 'rere', 'prerequisite includes')
, (6, 'rere', 'no rere without anus');
select *
from test
where description_snippet !~ concat('\m(', censored_word, ')\M')
id|censored_word|description_snippet
2|anus|manuscript submitted
3|anus|tetanus vaccination
4|anus|oceanus proposal
5|rere|prerequisite includes
You can use a regexp that searches for description_snippets that have at least one letter before or after the censored_word.
select * from test where lower(description_snippet) regexp lower(concat("[[:alpha:]]",censored_word,"|",censored_word,"[[:alpha:]]"));
Or use like
select * from test where lower(description_snippet) like (concat('%',lower(censored_word))) or lower(description_snippet) like(concat(lower(censored_word),"%"));
http://sqlfiddle.com/#!9/a471f3/7

How to show elements from 2 tables ( existing in one , another or both)

I have 2 tables that I need to combine for data analysis.
Table One (shows yearly consumption of items, with values, per contract)
Table One fields: product code, quantity, total value, contract number
Table Two (shows the products defined as included in each contract)
Table Two fields: included product code, included quantity, total included value, contract number
I need to join them so that the result shows, per contract, all related products, whether consumed, included, or both. That is, it should cover products that are consumed but not included, included but not consumed, and both included and consumed.
Something like this :
Contract|Product Code|Consumed qty|Included Qty|Consumed Total|Included Total
CTC001|X0001|55|45|550|450
CTC001|X0002|20|NULL|200|NULL
CTC001|X0003|NULL|10|NULL|100
CTC002|X0001|10|10|100|100
An inner join only shows the rows present in both tables.
A left or right join shows all rows from one table, with matches or NULLs from the other.
My goal is to show rows from both tables, as in the example.
Any help or tips?
(This is my current query. The field names don't all match the example, but you get the idea.)
SELECT dbo.USR_View_ArtIncludContr.strCodArtigo, dbo.USR_View_TotaisConsumos.strCodArtigo AS Expr2, dbo.USR_View_TotaisConsumos.QTDTOTAL,
dbo.USR_View_ArtIncludContr.fltQuantLimiteInc, dbo.USR_View_TotaisConsumos.VALORTOTAL, dbo.USR_View_ArtIncludContr.Total, dbo.USR_View_TotaisConsumos.strCodSecContrato,
dbo.USR_View_TotaisConsumos.strCodTpContrato, dbo.USR_View_TotaisConsumos.strCodExercContrato, dbo.USR_View_TotaisConsumos.intNumeroContrato, dbo.USR_View_ArtIncludContr.strCodSeccao,
dbo.USR_View_ArtIncludContr.strCodTpContrato AS Expr1, dbo.USR_View_ArtIncludContr.strCodExercicio, dbo.USR_View_ArtIncludContr.intNumero
FROM dbo.USR_View_ArtIncludContr INNER JOIN
dbo.USR_View_TotaisConsumos ON dbo.USR_View_ArtIncludContr.strCodSeccao = dbo.USR_View_TotaisConsumos.strCodSecContrato AND
dbo.USR_View_ArtIncludContr.strCodTpContrato = dbo.USR_View_TotaisConsumos.strCodTpContrato AND
dbo.USR_View_ArtIncludContr.strCodExercicio = dbo.USR_View_TotaisConsumos.strCodExercContrato AND dbo.USR_View_ArtIncludContr.intNumero = dbo.USR_View_TotaisConsumos.intNumeroContrato AND
dbo.USR_View_ArtIncludContr.strCodArtigo = dbo.USR_View_TotaisConsumos.strCodArtigo
Sounds like you want a full join:
SELECT aic.strCodArtigo, tc.strCodArtigo AS Expr2, tc.QTDTOTAL,
aic.fltQuantLimiteInc, tc.VALORTOTAL, aic.Total, tc.strCodSecContrato,
tc.strCodTpContrato, tc.strCodExercContrato, tc.intNumeroContrato, aic.strCodSeccao,
aic.strCodTpContrato AS Expr1, aic.strCodExercicio, aic.intNumero
FROM dbo.USR_View_ArtIncludContr aic FULL JOIN
dbo.USR_View_TotaisConsumos tc
ON aic.strCodSeccao = tc.strCodSecContrato AND
aic.strCodTpContrato = tc.strCodTpContrato AND
aic.strCodExercicio = tc.strCodExercContrato AND
aic.intNumero = tc.intNumeroContrato AND
aic.strCodArtigo = tc.strCodArtigo
Notice that table aliases make the query much easier to write and to read.
I just figured out a workaround:
I can use CASE WHEN for the contract and for the product code.
Something like this in the SELECT list:
CASE WHEN AIC.strCodArtigo IS NULL THEN TC.strCodArtigo WHEN TC.strCodArtigo IS NULL THEN AIC.strCodArtigo ELSE AIC.strCodArtigo END AS ARTIGO
combined with a FULL OUTER JOIN.
If anyone has a better way, I'd appreciate any opinion.
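One possibly tidier option is COALESCE, which returns its first non-NULL argument and so expresses the same "take whichever side is present" logic as the CASE expression. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module instead of SQL Server, and the simplified tables consumed and included with their columns are hypothetical stand-ins for the two views. FULL OUTER JOIN needs SQLite 3.39 or later, so an equivalent LEFT JOIN plus UNION ALL form is used on older versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE consumed (contract TEXT, product TEXT, qty INTEGER);
CREATE TABLE included (contract TEXT, product TEXT, qty INTEGER);
INSERT INTO consumed VALUES
    ('CTC001', 'X0001', 55), ('CTC001', 'X0002', 20), ('CTC002', 'X0001', 10);
INSERT INTO included VALUES
    ('CTC001', 'X0001', 45), ('CTC001', 'X0003', 10), ('CTC002', 'X0001', 10);
""")

# COALESCE picks the key from whichever side of the full join is present.
full_join_query = """
SELECT COALESCE(c.contract, i.contract) AS contract,
       COALESCE(c.product,  i.product)  AS product,
       c.qty AS consumed_qty,
       i.qty AS included_qty
FROM consumed AS c
FULL OUTER JOIN included AS i
  ON i.contract = c.contract AND i.product = c.product
ORDER BY 1, 2
"""

# Fallback for SQLite builds without FULL OUTER JOIN support.
emulated_query = """
SELECT c.contract, c.product, c.qty AS consumed_qty, i.qty AS included_qty
FROM consumed AS c
LEFT JOIN included AS i
  ON i.contract = c.contract AND i.product = c.product
UNION ALL
SELECT i.contract, i.product, NULL, i.qty
FROM included AS i
WHERE NOT EXISTS (SELECT 1 FROM consumed AS c
                  WHERE c.contract = i.contract AND c.product = i.product)
ORDER BY 1, 2
"""

supports_full_join = sqlite3.sqlite_version_info >= (3, 39, 0)
query = full_join_query if supports_full_join else emulated_query
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In the original query the equivalent would be COALESCE(aic.strCodArtigo, tc.strCodArtigo) AS ARTIGO, repeated for each of the contract key columns.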

Postgresql query for every day sold stock count

I have a CRM project that maintains product sales orders for every organization.
I want to count the stock sold each day. I have managed to do this by looping over the dates, but that is obviously a poor method that takes too much time and memory.
Please help me do this in a single query. Is it possible?
Here is my database structure for your reference.
product : id (PK), name
organization : id (PK), name
sales_order : id (PK), product_id (FK), organization_id (FK), sold_stock, sold_date(epoch time)
Expected Output for selected month :
organization | product | day1_sold_stock | day2_sold_stock | ..... | day30_sold_stock
http://sqlfiddle.com/#!15/e1dc3/3
Create the tablefunc extension:
CREATE EXTENSION IF NOT EXISTS tablefunc;
Query :
select "proId" as ProductId ,product_name as ProductName,organizationName as OrganizationName,
coalesce( "1-day",0) as "1-day" ,coalesce( "2-day",0) as "2-day" ,coalesce( "3-day",0) as "3-day" ,
coalesce( "4-day",0) as "4-day" ,coalesce( "5-day",0) as "5-day" ,coalesce( "6-day",0) as "6-day" ,
coalesce( "7-day",0) as "7-day" ,coalesce( "8-day",0) as "8-day" ,coalesce( "9-day",0) as "9-day" ,
coalesce("10-day",0) as "10-day" ,coalesce("11-day",0) as "11-day" ,coalesce("12-day",0) as "12-day" ,
coalesce("13-day",0) as "13-day" ,coalesce("14-day",0) as "14-day" ,coalesce("15-day",0) as"15-day" ,
coalesce("16-day",0) as "16-day" ,coalesce("17-day",0) as "17-day" ,coalesce("18-day",0) as "18-day" ,
coalesce("19-day",0) as "19-day" ,coalesce("20-day",0) as "20-day" ,coalesce("21-day",0) as"21-day" ,
coalesce("22-day",0) as "22-day" ,coalesce("23-day",0) as "23-day" ,coalesce("24-day",0) as "24-day" ,
coalesce("25-day",0) as "25-day" ,coalesce("26-day",0) as "26-day" ,coalesce("27-day",0) as"27-day" ,
coalesce("28-day",0) as "28-day" ,coalesce("29-day",0) as "29-day" ,coalesce("30-day",0) as "30-day" ,
coalesce("31-day",0) as"31-day"
from crosstab(
'select hist.product_id,pr.name,o.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),sum(sold_stock)
from sales_order hist
left join product pr on pr.id = hist.product_id
left join organization o on o.id = hist.organization_id
where EXTRACT(MONTH FROM TO_TIMESTAMP(hist.sold_date/1000)) =5
and EXTRACT(YEAR FROM TO_TIMESTAMP(hist.sold_date/1000)) = 2017
group by hist.product_id,pr.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),o.name
order by o.name,pr.name',
'select d from generate_series(1,31) d')
as ("proId" int ,product_name text,organizationName text,
"1-day" float,"2-day" float,"3-day" float,"4-day" float,"5-day" float,"6-day" float
,"7-day" float,"8-day" float,"9-day" float,"10-day" float,"11-day" float,"12-day" float,"13-day" float,"14-day" float,"15-day" float,"16-day" float,"17-day" float
,"18-day" float,"19-day" float,"20-day" float,"21-day" float,"22-day" float,"23-day" float,"24-day" float,"25-day" float,"26-day" float,"27-day" float,"28-day" float,
"29-day" float,"30-day" float,"31-day" float);
Note that this uses a PostgreSQL crosstab query. I have used coalesce to handle null values, so the crosstab shows 0 when there is no data to return.
The following query will help you get the same result:
select o.name,
p.name,
sum(case when extract(day from to_timestamp(sold_date)) = 1 then sold_stock else 0 end) as day1_sold_stock,
sum(case when extract(day from to_timestamp(sold_date)) = 2 then sold_stock else 0 end) as day2_sold_stock,
sum(case when extract(day from to_timestamp(sold_date)) = 3 then sold_stock else 0 end) as day3_sold_stock
from sales_order so,
organization o,
product p
where so.organization_id=o.id
and so.product_id=p.id
group by o.name,
p.name;
I have only provided the logic for 3 days; you can implement the same for the rest of the days.
Basically, first do the basic joins on id, and then check the day of each sale (after converting the epoch value to a timestamp and extracting the day).
You have a few options here but it is important to understand the limitations first.
The big limitation is that the planner needs to know the record size before the planning stage, so the columns have to be defined explicitly, not dynamically. There are various ways of getting around this. At the end of the day, you are probably going to have something like Bavesh's answer, but there are some tools that may help.
Secondly, you may want to aggregate by date in a simple query joining the three tables and then pivot.
For the second approach, you have two options:
You could run a simple query, pull the data into Excel or similar, and create a pivot table there. This is probably the easiest solution.
You could use the tablefunc extension to create the crosstab for you.
Then we get to the first problem which is that if you are always doing 30 days, then it is easy if tedious. But if you want to do every day for a month, you run into the row length problem. Here what you can do is create a dynamic query in a function (pl/pgsql) and return a refcursor. In this case the actual planning takes place in the function and the planner doesn't need to worry about it on the outer level. Then you call FETCH on the output.
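As a rough illustration of the "dynamic query" idea, the sketch below generates the per-day columns on the client side instead of inside a pl/pgsql function, which is a swapped-in technique and not the refcursor approach described above. It assumes Python with the sqlite3 module instead of PostgreSQL, and sold_date stored as epoch seconds. The function name build_daily_pivot_sql and the sample rows are hypothetical.

```python
import calendar
import sqlite3


def build_daily_pivot_sql(year: int, month: int) -> str:
    """Build a conditional-aggregation pivot with one column per day of the month."""
    days_in_month = calendar.monthrange(year, month)[1]
    day_columns = ",\n".join(
        f"       SUM(CASE WHEN CAST(strftime('%d', so.sold_date, 'unixepoch') AS INTEGER) = {day}"
        f" THEN so.sold_stock ELSE 0 END) AS day{day}_sold_stock"
        for day in range(1, days_in_month + 1)
    )
    return f"""
SELECT o.name AS organization,
       p.name AS product,
{day_columns}
FROM sales_order AS so
JOIN organization AS o ON o.id = so.organization_id
JOIN product AS p ON p.id = so.product_id
WHERE strftime('%Y-%m', so.sold_date, 'unixepoch') = '{year:04d}-{month:02d}'
GROUP BY o.name, p.name
ORDER BY o.name, p.name
"""


def ts(year: int, month: int, day: int) -> int:
    """Epoch seconds for noon UTC on the given date."""
    return calendar.timegm((year, month, day, 12, 0, 0))


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_order (id INTEGER PRIMARY KEY, product_id INTEGER,
                          organization_id INTEGER, sold_stock INTEGER,
                          sold_date INTEGER);
INSERT INTO product VALUES (1, 'Widget');
INSERT INTO organization VALUES (1, 'Acme');
""")
conn.executemany(
    "INSERT INTO sales_order (product_id, organization_id, sold_stock, sold_date)"
    " VALUES (1, 1, ?, ?)",
    [(3, ts(2017, 5, 1)), (4, ts(2017, 5, 1)), (5, ts(2017, 5, 3)), (7, ts(2017, 6, 1))],
)

rows = conn.execute(build_daily_pivot_sql(2017, 5)).fetchall()
print(rows[0][:5])
```

Because the year and month are integers formatted into the SQL text, nothing user-supplied is interpolated as a string; the number of day columns follows the calendar (28 to 31).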

Extract multiple strings from a free text field

Let's say I have a free-text field called 'Note' that contains "ABC:5/52 , *back, orders received".
How do I extract '5/52' and 'back' and place them in two separate columns?
Here's my attempt so far:
QUERY:-
SELECT *, SUBSTRING(Note, CHARINDEX(':', Note)+1, 4) as ABC,
SUBSTRING(Note, CHARINDEX('*', Note)+1, 4) as Ret_Stat
, CHARINDEX(':', Note) AS [Colon Index]
FROM [AdventureWorks2012].[Sales].[Comments]
RESULT:-
Note|ABC|Ret_Stat
ABC:3/52, To give more explanation, *back|3/52|back
ABC:3wks, To debrief, *back, r/v|3wks|back
ABC:13/09/16, see cm, *back, new referral|13/0|back
My issue is that I want to extract 3/52, 3wks, and 13/09/16, but for the last one my result is only 13/0.
How can I achieve this? The extracted value may be anywhere from 4 to 8 characters long after ABC:, and the table contains thousands of rows of data.
Any advice is appreciated. Thank you.
Here's an example of what you want to do. You may have to modify this a little as I don't have a lot of sample data to go on.
Test Data
IF OBJECT_ID('tempdb..#TempData') IS NOT NULL DROP TABLE #TempData
GO
CREATE TABLE #TempData (Notes varchar(100))
INSERT INTO #TempData (Notes)
VALUES
('This is the first "ABC:5/52 string *back, orders received')
,('*back, orders receivedThis"ABC:5/52 string is the second one')
,('You guessed it, this *back, orders received is the third "ABC:5/52 string')
Query
SELECT
CHARINDEX('*',Notes) AsteriskLocation
,SUBSTRING(Notes,CHARINDEX('*',Notes)+1,4) AfterAsterisk
,CHARINDEX(':',Notes) ColonLocation
,SUBSTRING(Notes,CHARINDEX(':',Notes)+1,4) AfterColon
FROM #TempData
Result
AsteriskLocation|AfterAsterisk|ColonLocation|AfterColon
36|back|23|5/52
1|back|31|5/52
22|back|62|5/52
I've left the locations separately so that you can see how they're used in the query. You could search for strings too using the same method.
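The fixed length of 4 does not cover the variable-length values from the question (4 to 8 characters). One possible approach, sketched here under assumptions, is to cut at the next comma after the marker instead. The example uses Python's sqlite3 module instead of SQL Server, so instr and substr stand in for CHARINDEX and SUBSTRING, and the table name comments is hypothetical. It assumes the first ':' and the first '*' in the note are the markers of interest and that each value ends at a comma or at the end of the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (note TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?)",
    [
        ("ABC:3/52, To give more explanation, *back",),
        ("ABC:3wks, To debrief, *back, r/v",),
        ("ABC:13/09/16, see cm, *back, new referral",),
        ("ABC:5/52 , *back, orders received",),
    ],
)

# For each marker: take the text after it, append ',' so a delimiter always
# exists, and keep everything before the first comma. trim() drops stray spaces.
query = """
SELECT trim(substr(note,
                   instr(note, ':') + 1,
                   instr(substr(note, instr(note, ':') + 1) || ',', ',') - 1)) AS abc,
       trim(substr(note,
                   instr(note, '*') + 1,
                   instr(substr(note, instr(note, '*') + 1) || ',', ',') - 1)) AS ret_stat
FROM comments
ORDER BY rowid
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In T-SQL the same idea can be written with CHARINDEX, which accepts an optional start position as its third argument, so the search for the comma can begin right after the colon or asterisk.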