Finding Substring in HIVE

I am trying to extract substring values from a string (composite_key).
My composite_key looks like this:
string1|string2|string3|string4|string5|string6|string7
I am able to find string1, string2, string3, string4 and string5 using Impala's substring method.
Can someone please help me find string6 and string7 using the substring method?
Any help will be appreciated.

You can use a Hive subquery, the array data structure, and the split function to accomplish this. Note that split takes a regular expression, so the pipe has to be escaped. However, this only works in Hive. Impala does not support nested data structures yet, except for Parquet-based tables in Impala 2.3 (corresponding to CDH 5.5) and higher.
select
key_array[0] part0,
key_array[1] part1,
key_array[2] part2,
key_array[3] part3,
key_array[4] part4,
key_array[5] part5,
key_array[6] part6
from (
select split(composite_key,'\\|') as key_array
from mytable
) as temp
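One caveat worth checking with the split approach: Hive's split treats its second argument as a regular expression, and in a regex a bare | means alternation rather than a literal pipe. The sketch below is Python, not Hive, and only illustrates that regex behaviour with the standard re module; the sample key is made up to match the shape in the question.

```python
import re

# Hypothetical sample key in the same shape as the question's composite_key.
composite_key = "string1|string2|string3|string4|string5|string6|string7"

# Escaped pipe: the delimiter is matched literally, giving the seven fields.
parts = re.split(r"\|", composite_key)

# Bare pipe: the pattern is an alternation of two empty branches, so it
# matches the empty string at position 0 instead of finding a literal "|".
bare_match = re.search("|", composite_key)

print(parts[5], parts[6])   # the sixth and seventh fields
print(bare_match.span())    # where the bare pattern matched
```

If the real keys can contain empty fields or a trailing delimiter, it is worth testing those cases separately, since regex-based split functions differ in how they treat them.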

I was able to do it using the queries below.
For String7:
select substring(composite_key,locate('|',composite_key,locate('|',composite_key,locate('|',composite_key,locate('|',composite_key,locate('|',composite_key, locate('|',composite_key) + 1)+1)+1)+1)+1)+1) as a
For String6:
select
substring(composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key) + 1)+1)+1)+1)+1,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key) + 1)+1)+1)+1)+1)
- locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key,
locate('|',composite_key) + 1)+1)+1)+1)-1)
as a
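The nested locate calls above all follow one pattern: find a delimiter, step one character past it, and search again. The Python sketch below spells that pattern out as a loop. The helper names locate and nth_field are made up for illustration, and the 1-based, 0-if-missing behaviour of locate is an assumption about how the SQL function behaves.

```python
def locate(substr, s, pos=1):
    """1-based position of substr in s at or after pos; 0 if it is absent."""
    return s.find(substr, pos - 1) + 1


def nth_field(key, n, delim="|"):
    """Return the n-th (1-based) delimited field using only locate-style lookups."""
    start = 1
    # Skip past the first n-1 delimiters, one locate call per delimiter,
    # which is what each level of nesting does in the SQL version.
    for _ in range(n - 1):
        p = locate(delim, key, start)
        if p == 0:
            return None  # the key has fewer than n fields
        start = p + 1
    end = locate(delim, key, start)
    if end == 0:
        return key[start - 1:]       # last field: take everything to the end
    return key[start - 1:end - 1]    # stop just before the next delimiter


composite_key = "string1|string2|string3|string4|string5|string6|string7"
print(nth_field(composite_key, 6), nth_field(composite_key, 7))
```

The String6 query corresponds to the final return (start after the fifth pipe, length up to the sixth), and the String7 query corresponds to the branch that takes everything after the sixth pipe.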

Related

How can your current SQL query refer back to a previous query without writing aliases in the query?

I am taking the Google Data Analytics course on Coursera and in the video the instructor executed the following query:
SELECT
Date,
Region,
Small_Bags,
Large_Bags,
XLarge_Bags,
Total_Bags,
Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
FROM
`class-5-355317.avocado_data.avocado_prices`
After executing this query, they opened a different editor window in BigQuery and executed the following query, which refers to an alias from the query above without defining it:
SELECT
*
FROM
`class-5-355317.avocado_data.avocado_prices`
WHERE
Total_Bags != Total_Bags_Calc
When I executed this query it did not work for me and I received this error: 'Unrecognized name: Total_Bags_Calc; Did you mean Total_Bags?'
This makes sense. Within this query, the alias 'Total_Bags_Calc' hadn't been used within that query and didn't have anything to pull, so I tried a workaround:
SELECT
Date,
Region,
Small_Bags,
Large_Bags,
XLarge_Bags,
Total_Bags,
(SELECT Small_Bags + Large_Bags + XLarge_Bags FROM `class-5355317.avocado_data.avocado_prices`) AS Total_Bags_Calc
FROM `class-5-355317.avocado_data.avocado_prices`
WHERE
Total_Bags != Total_Bags_Calc
From what I understood this should work since the subquery now held the alias 'Total_Bags_Calc' but I still received the error Unrecognized name: Total_Bags_Calc; Did you mean Total_Bags?
How can I make this query work, and is there any way to have a query reference another query in the same manner that theirs did in the example?
You'll want to select FROM the result of your first query, so try moving that subquery into the FROM clause.
For example,
SELECT
*
FROM
{{ your other query goes here }}
WHERE
Total_Bags != Total_Bags_Calc
Which would be:
SELECT
*
FROM
(SELECT
Date,
Region,
Small_Bags,
Large_Bags,
XLarge_Bags,
Total_Bags,
Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
FROM
`class-5-355317.avocado_data.avocado_prices`
) as subquery
WHERE
Total_Bags != Total_Bags_Calc
This is a really helpful technique, so it is worth learning. However, since you're doing something rather simple, you can get away with putting that logic directly in your WHERE clause.
SELECT
Date,
Region,
Small_Bags,
Large_Bags,
XLarge_Bags,
Total_Bags,
Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
FROM
`class-5-355317.avocado_data.avocado_prices`
WHERE (Small_Bags + Large_Bags + XLarge_Bags) <> Total_Bags
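To see that the two forms select the same rows, here is a small sketch using Python's built-in sqlite3 module. The table contents are made up, and the table is a local stand-in for the BigQuery one. Note that SQLite is more lenient than BigQuery about aliases in WHERE, so this only checks that the derived-table query and the inline-expression query agree; it does not reproduce the original error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE avocado_prices ("
    "Region TEXT, Small_Bags INTEGER, Large_Bags INTEGER, "
    "XLarge_Bags INTEGER, Total_Bags INTEGER)"
)
# Made-up rows: the second one has a Total_Bags that does not match the sum.
conn.executemany(
    "INSERT INTO avocado_prices VALUES (?, ?, ?, ?, ?)",
    [("Albany", 10, 5, 1, 16), ("Boston", 20, 4, 0, 30)],
)

# Form 1: compute the alias in a derived table, filter on it outside.
derived = conn.execute(
    """
    SELECT Region, Total_Bags, Total_Bags_Calc
    FROM (
        SELECT Region, Total_Bags,
               Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
        FROM avocado_prices
    ) AS subquery
    WHERE Total_Bags != Total_Bags_Calc
    """
).fetchall()

# Form 2: repeat the expression in WHERE instead of referring to the alias.
inline = conn.execute(
    """
    SELECT Region, Total_Bags,
           Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
    FROM avocado_prices
    WHERE (Small_Bags + Large_Bags + XLarge_Bags) <> Total_Bags
    """
).fetchall()

print(derived)  # [('Boston', 30, 24)]
```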
Good Morning Tituslcuster!
I think I have spotted the issue.
Your query creates a column named Total_Bags_Calc.
When execution reaches the WHERE clause, Total_Bags_Calc is the part that breaks your query.
This is because it doesn't exist in the FROM table; it only exists as an alias in the SELECT list.
You can do two different things to fix this:
You can wrap the whole query in a subquery and apply the WHERE on Total_Bags_Calc in the outer query.
Or you can replace Total_Bags_Calc with the actual formula that you used to calculate it.
Here, try this one:
select x2.* from (
SELECT
Date,
Region,
Small_Bags,
Large_Bags,
XLarge_Bags,
Total_Bags,
( Small_Bags + Large_Bags + XLarge_Bags ) AS Total_Bags_Calc
FROM `class-5-355317.avocado_data.avocado_prices`
) as x2
where x2.Total_Bags != x2.Total_Bags_Calc

trouble with dynamic year filter in MDX

I have the following query that I use in Azure Data Factory (this is on the source of a copy action):
SELECT
{ [Measures].[0INV_QTY],
[Measures].[0NET_VAL_S] }
ON COLUMNS,
NON EMPTY
{ [0CUST_SALES].[LEVEL01].MEMBERS *
[0SALESORG].[LEVEL01].MEMBERS *
[0COMPANY].[LEVEL01].MEMBERS *
[0MATERIAL].[LEVEL01].MEMBERS *
[ZDEBITOR].[LEVEL01].MEMBERS *
[0FISCPER].[LEVEL01].MEMBERS *
[0DEB_CRED].[LEVEL01].MEMBERS *
[0BILLTOPRTY].[LEVEL01].MEMBERS *
[0DOC_CATEG].[LEVEL01].MEMBERS *
[0SHIP_TO].[LEVEL01].MEMBERS }
DIMENSION PROPERTIES
[0SALESORG].[20SALESORG],
[0COMPANY].[20COMPANY],
[0CUST_SALES].[80CUST_SALES],
[0CUST_SALES].[20CUST_GRP1],
[0CUST_SALES].[20PMNTTRMS],
[ZDEBITOR].[20CRED_LIMIT],
[0MATERIAL].[20MATERIAL],
[0DEB_CRED].[20DEB_CRED],
[0BILLTOPRTY].[20BILLTOPRTY],
[0DOC_CATEG].[20DOC_CATEG],
[0SHIP_TO].[20SHIP_TO],
[0FISCPER].[80FISCPER]
ON ROWS
FROM $0SD_C03
WHERE ({[0CALYEAR].[2020], [0CALYEAR].[2021], [0CALYEAR].[2018], [0CALYEAR].[2019]})
Here I would like to replace the WHERE with something like Cast(YEAR(GETDATE())-4 as varchar(10)). I am really new to MDX and I keep getting stuck. Could anyone point me in the right direction?
What I want to achieve is not having to adjust the query every year, while always restricting it to the last 4 years.
If you are looking for the MDX equivalent of the following SQL:
SELECT ... FROM ... WHERE date > DATEADD(year,-4,GETDATE())
Try using "with member" and the ParallelPeriod function.
CREATE MEMBER CurrentCube.Measures.[Last4Years] AS
ParallelPeriod( [Date].[Date].[Date Yr], 4, StrToMember("[Date].[Date].&[" + Format(now(), "yyyyMMdd") + "]"))
: StrToMember("[Date].[Date].&[" + Format(now(), "yyyyMMdd") + "]")
;
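A different route, if the query text can be assembled before it is sent to the source, is to generate the year members for the WHERE clause from the current date. The Python sketch below only shows the string-building step. The helper name calyear_set is made up, and whether the pipeline can inject the generated text into the query is an assumption to verify for your setup.

```python
from datetime import date


def calyear_set(current_year, years=4, dimension="0CALYEAR"):
    """Build an MDX set of the last `years` calendar-year members, oldest first."""
    members = [
        f"[{dimension}].[{year}]"
        for year in range(current_year - years + 1, current_year + 1)
    ]
    return "{" + ", ".join(members) + "}"


# The set for whatever year it is when the script runs.
where_clause = f"WHERE ({calyear_set(date.today().year)})"
print(where_clause)

# Fixed input for comparison with the hard-coded set in the question.
print(calyear_set(2021))
```

With 2021 as the current year this reproduces the four members from the original WHERE clause (2018 through 2021); change the years argument if the window should exclude the current year.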

Multi Value in Query Select PrestoDB

I'm trying to run a query against PrestoDB like this:
SELECT *
FROM (
VALUES
concat('<a href="https:test?preselect_filters=',
url_encode(concat('{"27":{','"column_name":["', CAST({{ "'" +
"','".join(filter_values('column_name', concat('#value1', '#value2'))) +
"'" }} AS VARCHAR), '"]', ',','}}')),
'#TAB-bEGT8cU5h">Qun', '</a>')
) dashboard_link (links)
My problem is concat('#value1', '#value2'). Any idea how to select an array of values?
Is this what you need?
select * from (values concat(ARRAY [1,2,3], ARRAY [4,5,6]));
You can refer to the PrestoDB documentation.

How to Convert SQL to Laravel Query

I want to build something with an aliases column like below, but I don't know how to make it a Laravel query. The following is my SQL.
SELECT
d.*,
d.damaged_building,
d.total_victim
FROM
(
SELECT
delete_flg,
damage_id,
create_time,
reg_user_name,
municipality_id,
report_time,
district,
village,
disaster_type,
cause_of_disaster,
(
bd_major_damage1 + bd_minor_damage1 + bd_major_damage2 + bd_minor_damage2 + bd_major_damage3 + bd_minor_damage3
)
as damaged_building,
(
hi_hd_deaths + hi_hd_serious_injuries + hi_hd_minor_injuries + hi_hd_missing_persons + hi_hd_sick_persons
)
as total_victim
FROM
d_damage
WHERE
delete_flg = 0
ORDER BY
create_time DESC
)
d
I want to translate this to Eloquent or the Laravel Query Builder for use in my controller.
You may try this:
DB::table('d_damage')
->where('delete_flg', 0)
->orderBy('create_time', 'desc')
->select(
'delete_flg', 'damage_id', 'create_time', 'reg_user_name', 'municipality_id',
'report_time', 'district', 'village', 'disaster_type', 'cause_of_disaster',
DB::raw('(bd_major_damage1 + bd_minor_damage1 + bd_major_damage2 + bd_minor_damage2 + bd_major_damage3 + bd_minor_damage3) as damaged_building'),
DB::raw('(hi_hd_deaths + hi_hd_serious_injuries + hi_hd_minor_injuries + hi_hd_missing_persons + hi_hd_sick_persons) as total_victim')
)
->get();
This may not be exact, but it should be close.
If I were you, I would just stick to learning SQL, since you can take it anywhere and it carries over to a lot of frameworks and tools, rather than investing your time in Laravel's query builder. Anyway, you can still use SQL queries in Laravel through raw expressions: https://laravel.com/docs/5.8/queries#raw-expressions

SQL query concatenation to create file path

I need to create a file path from 3 columns in a SQL query. This will be utilized in a file once everything is completed. I have tried using CONCAT and string methods for the columns but no luck. The query is provided below.
SELECT
dbo.TBIndexData.DocGroup,
dbo.TBIndexData.Index1 AS Title,
dbo.TBIndexData.Index2 AS Meeting_Date,
dbo.TBIndexData.Index3 AS Meeting_Number,
dbo.TBIndexData.Index4 AS Meeting_Type,
dbo.TBIndexData.Index5 AS Doc_Name,
dbo.TBIndexData.Index6 AS Doc_Type,
dbo.TBIndexData.Index7 AS Meeting_Page,
dbo.TBIndexData.Index8 AS Notes,
dbo.TBIndexData.Index9 AS TBUser,
dbo.TBIndexData.Index10 AS Date_Scanned,
CONCAT (dbo.TBPrimary.FileDir + '\' + dbo.TBPrimary.TimsFileID + '.' + dbo.TBPrimary.FileExtension) AS FilePath
FROM
dbo.TBIndexData
JOIN
dbo.TBPrimary ON dbo.TBIndexData.DocGroup = dbo.TBPrimary.DocGroup
In SQL Server 2008, where CONCAT is not available (it was added in SQL Server 2012), you need something like
SELECT I.DocGroup,
I.Index1 AS Title,
I.Index2 AS Meeting_Date,
I.Index3 AS Meeting_Number,
I.Index4 AS Meeting_Type,
I.Index5 AS Doc_Name,
I.Index6 AS Doc_Type,
I.Index7 AS Meeting_Page,
I.Index8 AS Notes,
I.Index9 AS TBUser,
I.Index10 AS Date_Scanned,
P.FileDir + '\' + CAST(P.TimsFileID AS VARCHAR(10)) +
'.' + P.FileExtension AS FilePath
FROM dbo.TBIndexData I
JOIN dbo.TBPrimary P
ON I.DocGroup = P.DocGroup
You shouldn't use schemaname.tablename in the SELECT list. This is not officially supported grammar. Just use tablename or give an alias.
(Using two part names there can lead to confusing errors if calling properties of columns of CLR datatypes)
Try using CONCAT with commas:
CONCAT (dbo.TBPrimary.FileDir, '\', dbo.TBPrimary.TimsFileID, '.', dbo.TBPrimary.FileExtension) AS FilePath