How to count unique integers in a string using hive?

I am trying to count the unique bytes (digits) in a string.
DATA (Phone numbers for example with only numeric bytes):
1234567890
1111111112
Results:
10
2
I tried the query below, but it didn't work. I think that is because sum() won't accept the 'if' UDF inside it.
select phone
, sum(
cast(if(length(regexp_replace(phone,'0',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'1',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'2',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'3',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'4',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'5',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'6',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'7',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'8',''))<10,'1','0') as int) +
cast(if(length(regexp_replace(phone,'9',''))<10,'1','0') as int)
) as unique_bytes
from table;
I am not opposed to regular expressions as a solution either.

Use + . . . but like this:
select phone,
((case when phone like '%0%' then 1 else 0 end) +
(case when phone like '%1%' then 1 else 0 end) +
(case when phone like '%2%' then 1 else 0 end) +
(case when phone like '%3%' then 1 else 0 end) +
(case when phone like '%4%' then 1 else 0 end) +
(case when phone like '%5%' then 1 else 0 end) +
(case when phone like '%6%' then 1 else 0 end) +
(case when phone like '%7%' then 1 else 0 end) +
(case when phone like '%8%' then 1 else 0 end) +
(case when phone like '%9%' then 1 else 0 end)
) as ints
from table;
Your code has several issues:
sum() is an aggregation function and is not needed.
The if() is returning strings, but you are adding the values together.
I'm not sure why you are using regexp_replace() rather than just replace().
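To check the one-CASE-per-digit idea against the sample data, here is a minimal sketch that runs the same expression through Python's built-in sqlite3 module. SQLite is only a stand-in for Hive here, and the table name phones is made up for the example.

```python
import sqlite3

# Build one "(CASE WHEN phone LIKE '%d%' THEN 1 ELSE 0 END)" term per digit.
terms = " + ".join(
    f"(CASE WHEN phone LIKE '%{d}%' THEN 1 ELSE 0 END)" for d in "0123456789"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (phone TEXT)")  # hypothetical table name
conn.executemany(
    "INSERT INTO phones VALUES (?)", [("1234567890",), ("1111111112",)]
)

# No sum() is needed: the terms are added per row, not aggregated.
rows = conn.execute(
    f"SELECT phone, {terms} AS ints FROM phones ORDER BY rowid"
).fetchall()
for phone, ints in rows:
    print(phone, ints)
```

This prints 10 for the first number and 2 for the second, matching the results asked for in the question.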

with tab1 as (
select stack(3,
'1','1234567890',
'2','1111111112',
'3','2222222223') as (col0, col1))
select tab1.col0, count(distinct tf.col) from tab1 lateral view explode(split(tab1.col1,'')) tf as col
where tf.col regexp '\\d'
group by tab1.col0

Related

error converting % to datatype int in sql

I am using a simple query. I want to extract the current year and use it to filter the rows of the table:
select *
from v_AuthListInfo LI
where title like '%SUG%'
and Title like '%P1%'
and Title like '%' + '' + year(getdate()) + '' + '%'
I am getting this error:
Msg 245, Level 16, State 1, Line 26
Conversion failed when converting the varchar value '%' to data type int.
Ideally the year should come out as 2018, and the query should return the records containing 2018.
I also want the records for the previous year, 2017, so in all I want the records for 2018 and 2017.
I can't get this to work. Can you tell me where I am going wrong in the concatenation?
I also want to combine the output of these two queries:
select
count(*) [Total Clients], li.title,li.CI_UniqueID,coll.name,
SUM (CASE WHEN ucs.status=3 or ucs.status=1 then 1 ELSE 0 END ) as 'Installed / Not Applicable',
sum( case When ucs.status=2 Then 1 ELSE 0 END ) as 'Required',
sum( case When ucs.status=0 Then 1 ELSE 0 END ) as 'Unknown',
round((CAST(SUM (CASE WHEN ucs.status=3 or ucs.status=1 THEN 1 ELSE 0 END) as float)/count(*) )*100,2) as 'Compliant%',
round((CAST(count(case when ucs.status not in('3','1') THEN '*' end) as float)/count(*))*100,2) as 'NotCompliant%'
From v_Update_ComplianceStatusAll UCS
inner join v_r_system sys on ucs.resourceid=sys.resourceid
inner join v_FullCollectionMembership fcm on ucs.resourceid=fcm.resourceid
inner join v_collection coll on coll.collectionid=fcm.collectionid
inner join v_AuthListInfo LI on ucs.ci_id=li.ci_id
where coll.CollectionID='SMS00001' and
--title like '%SUG%'
Title like '%P1%'
and Title like '%SUG_' + '' + CAST(year(getdate()) as varchar) + '' + '%'
--and Title like '%SUG_' + '' + CAST(year(getdate())-1 as varchar) + '' + '%'
group by li.title,li.CI_UniqueID,coll.name
order by li.title ASC
select
count(*) [Total Clients], li.title,li.CI_UniqueID,coll.name,
SUM (CASE WHEN ucs.status=3 or ucs.status=1 then 1 ELSE 0 END ) as 'Installed / Not Applicable',
sum( case When ucs.status=2 Then 1 ELSE 0 END ) as 'Required',
sum( case When ucs.status=0 Then 1 ELSE 0 END ) as 'Unknown',
round((CAST(SUM (CASE WHEN ucs.status=3 or ucs.status=1 THEN 1 ELSE 0 END) as float)/count(*) )*100,2) as 'Compliant%',
round((CAST(count(case when ucs.status not in('3','1') THEN '*' end) as float)/count(*))*100,2) as 'NotCompliant%'
From v_Update_ComplianceStatusAll UCS
inner join v_r_system sys on ucs.resourceid=sys.resourceid
inner join v_FullCollectionMembership fcm on ucs.resourceid=fcm.resourceid
inner join v_collection coll on coll.collectionid=fcm.collectionid
inner join v_AuthListInfo LI on ucs.ci_id=li.ci_id
where coll.CollectionID='SMS00001' and
--title like '%SUG%'
Title like '%P1%'
-- Title like '%SUG_' + '' + CAST(year(getdate()) as varchar) + '' + '%'
and Title like '%SUG_' + '' + CAST(year(getdate())-1 as varchar) + '' + '%'
group by li.title,li.CI_UniqueID,coll.name
order by li.title ASC

Cast the year to varchar, because year() returns an int value:
select * from v_AuthListInfo LI
where title like '%SUG%'
and Title like '%P1%'
and Title like '%' + '' + CAST(year(getdate()) as varchar(4)) + '' + '%'

The problem is that year() returns a number, not a string. Because of this, SQL Server interprets the + as addition, rather than string concatenation.
SQL Server has a convenient function, datename(), that returns a string:
select *
from v_AuthListInfo LI
where title like '%SUG%' and
title like '%P1%' and
title like '%' + datename(year, getdate()) + '%';
The empty strings that you are concatenating in the like pattern are useless.
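The question also asks for both 2018 and 2017. One way to get that in a single query is to OR two patterns together. Below is a rough sketch of that idea using Python's sqlite3 module, not SQL Server: the table auth_list and its titles are invented, the year is passed in as a parameter so the result is repeatable, and SQLite concatenates with || where SQL Server uses +.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_list (title TEXT)")  # stand-in for v_AuthListInfo
conn.executemany(
    "INSERT INTO auth_list VALUES (?)",
    [("SUG_2018_P1",), ("SUG_2017_P1",), ("SUG_2016_P1",), ("SUG_2018_P2",)],
)

# The year is cast to text before it is concatenated into the pattern,
# which is the same fix as the CAST(... as varchar) shown above.
query = """
SELECT title
FROM auth_list
WHERE title LIKE '%SUG%'
  AND title LIKE '%P1%'
  AND (title LIKE '%' || CAST(:year AS TEXT) || '%'
       OR title LIKE '%' || CAST(:year - 1 AS TEXT) || '%')
ORDER BY title
"""
matches = [row[0] for row in conn.execute(query, {"year": 2018})]
print(matches)
```

With the sample rows this returns only the 2017 and 2018 P1 titles. The same OR structure should carry over to the two reporting queries in the question, so they would not need to be run separately, but I have not tested that on SQL Server.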

SQL count string matches in each row

Please take a look at this simple SQL Server database:
I want to create a summary with only 3 columns. Here is the code:
select ProductID, Name,
*code* as CountString
from product
where Name in ('this', 'is', 'count', 'example')
I want the result to have 3 columns, where the column "CountString" is the total number of strings that match ('this', 'is', 'count', 'example'). Here is the result I want:
So for example, I want the CountString for ProductID 1 to be 4, because it contains all 4 words.
If you can solve this, it would be amazing!

If I understand correctly:
select ProductID, Name,
( (case when Name like '%this%' then 1 else 0 end) +
(case when Name like '%is%' then 1 else 0 end) +
(case when Name like '%count%' then 1 else 0 end) +
(case when Name like '%example%' then 1 else 0 end)
) as CountString
from product;
Note: Any Name that has "this" also has "is".
If "words" are separated by spaces (and only spaces), you can do:
select ProductID, Name,
( (case when concat(' ', Name, ' ') like '% this %' then 1 else 0 end) +
(case when concat(' ', Name, ' ') like '% is %' then 1 else 0 end) +
(case when concat(' ', Name, ' ') like '% count %' then 1 else 0 end) +
(case when concat(' ', Name, ' ') like '% example %' then 1 else 0 end)
) as CountString
from product;
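To see the difference between the two versions, here is a small sketch run through Python's sqlite3 module. SQLite stands in for SQL Server, so || replaces concat(), and the sample row 'this example' is made up.

```python
import sqlite3

words = ["this", "is", "count", "example"]

def count_expr(column, pad):
    # pad=True wraps both the column and each word in spaces so that
    # only whole, space-separated words match.
    target = f"(' ' || {column} || ' ')" if pad else column
    fmt = "% {} %" if pad else "%{}%"
    return " + ".join(
        f"(CASE WHEN {target} LIKE '{fmt.format(w)}' THEN 1 ELSE 0 END)"
        for w in words
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (ProductID INTEGER, Name TEXT)")
conn.execute("INSERT INTO product VALUES (1, 'this example')")  # hypothetical row

naive, padded = conn.execute(
    f"SELECT {count_expr('Name', False)}, {count_expr('Name', True)} FROM product"
).fetchone()
print(naive, padded)
```

The unpadded version reports 3 because "is" is found inside "this", while the padded version reports 2.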

The following query should meet your need:
SELECT PRODUCTID,
NAME,
REGEXP_COUNT(NAME, 'this|is|count|example', 1, 'c') CountString
FROM product;
This query does case-sensitive matching, which means only "example" is counted, not "Example". For case-insensitive matching, use 'i' instead of 'c'.

How to construct a new column based on other columns in a SELECT

I have this table:
Assets
--------------------------------
Description VARCHAR(50) NOT NULL
Suffix1 VARCHAR(50) NOT NULL
UseSuffix1 BIT NOT NULL
Suffix2 VARCHAR(50) NULL
UseSuffix2 BIT NOT NULL
Suffix3 VARCHAR(50) NULL
UseSuffix3 BIT NOT NULL
I am trying to write a SELECT statement that constructs a VARCHAR(MAX) column consisting of the Description field plus the suffixes, each appended only when required (via the UseSuffixX flag).
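Examples of input and output:
Description='MyDesc', Suffix1='Suffix1', UseSuffix1=0, Suffix2=NULL, UseSuffix2=0, Suffix3=NULL, UseSuffix3=0 -> 'MyDesc'
Description='MyDesc', Suffix1='Suffix1', UseSuffix1=1, Suffix2=NULL, UseSuffix2=0, Suffix3=NULL, UseSuffix3=0 -> 'MyDesc - Suffix1'
Description='MyDesc', Suffix1='Suffix1', UseSuffix1=0, Suffix2='Suffix2', UseSuffix2=1, Suffix3='Suffix3', UseSuffix3=1 -> 'MyDesc - Suffix2 - Suffix3'
Description='MyDesc', Suffix1='Suffix1', UseSuffix1=1, Suffix2='Suffix2', UseSuffix2=0, Suffix3='Suffix3', UseSuffix3=1 -> 'MyDesc - Suffix1 - Suffix3'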
I started by using a CASE directive in my SELECT like this:
SELECT
[Description] +
CASE
WHEN UseSuffix1 = 1 THEN ' - ' + Suffix1
WHEN UseSuffix2 = 1 THEN ' - ' + Suffix2
WHEN UseSuffix3 = 1 THEN ' - ' + Suffix3
ELSE ''
END
FROM Assets
but quickly realized that I would need to expand the trees of all possibilities in each WHEN branch...not sure if I'm expressing myself correctly here.
What would be the more practical way to do this?

You don't need all the possibilities, just one case per suffix:
SELECT ([Description] +
(CASE WHEN UseSuffix1 = 1 THEN ' - ' + Suffix1 ELSE '' END) +
(CASE WHEN UseSuffix2 = 1 THEN ' - ' + Suffix2 ELSE '' END) +
(CASE WHEN UseSuffix3 = 1 THEN ' - ' + Suffix3 ELSE '' END)
)
FROM Assets
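As a quick check against the four examples in the question, here is a sketch of the same one-CASE-per-suffix query run through Python's sqlite3 module. SQLite is a stand-in for SQL Server, so || replaces + and the column types are simplified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Assets (
        Description TEXT NOT NULL,
        Suffix1 TEXT NOT NULL, UseSuffix1 INTEGER NOT NULL,
        Suffix2 TEXT,          UseSuffix2 INTEGER NOT NULL,
        Suffix3 TEXT,          UseSuffix3 INTEGER NOT NULL)"""
)
# The four example rows from the question, in the same order.
conn.executemany(
    "INSERT INTO Assets VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("MyDesc", "Suffix1", 0, None, 0, None, 0),
        ("MyDesc", "Suffix1", 1, None, 0, None, 0),
        ("MyDesc", "Suffix1", 0, "Suffix2", 1, "Suffix3", 1),
        ("MyDesc", "Suffix1", 1, "Suffix2", 0, "Suffix3", 1),
    ],
)

# Each suffix gets its own independent CASE; none depends on the others.
query = """
SELECT Description
    || CASE WHEN UseSuffix1 = 1 THEN ' - ' || Suffix1 ELSE '' END
    || CASE WHEN UseSuffix2 = 1 THEN ' - ' || Suffix2 ELSE '' END
    || CASE WHEN UseSuffix3 = 1 THEN ' - ' || Suffix3 ELSE '' END
FROM Assets
ORDER BY rowid
"""
labels = [row[0] for row in conn.execute(query)]
for label in labels:
    print(label)
```

One thing to watch: if a flag is 1 while its suffix is NULL, the concatenation yields NULL in this sketch, so you may want to guard nullable suffixes if that combination can occur in your data.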

convert certain column names with comma separated string from sql table with conditions

For example, I have this table with different column names and a Boolean value below each:
case1 case2 case3 case4
1 0 1 0
I want to retrieve only the names of the columns whose value is 1, so the desired result of the query is case1,case3.
Desired output: case1,case3
The SQL query fetches only one row.
Is there any way to do this?

If I understand correctly, you could use a big case statement:
select stuff(( (case when case1 = 1 then ',case1' else '' end) +
(case when case2 = 1 then ',case2' else '' end) +
(case when case3 = 1 then ',case3' else '' end) +
(case when case4 = 1 then ',case4' else '' end)
), 1, 1, '') as columns
from your_table_name;
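Here is a small sketch of the same idea run through Python's sqlite3 module, using the single row from the question. SQLite has no STUFF(), so substr(..., 2) is used as a stand-in to drop the leading comma, and the table name flags is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flags (case1 INTEGER, case2 INTEGER, case3 INTEGER, case4 INTEGER)"
)  # hypothetical table name
conn.execute("INSERT INTO flags VALUES (1, 0, 1, 0)")

# Every matching column contributes ',<name>'; substr(..., 2) then removes
# the leading comma, which is what STUFF(..., 1, 1, '') does above.
query = """
SELECT substr(
    (CASE WHEN case1 = 1 THEN ',case1' ELSE '' END) ||
    (CASE WHEN case2 = 1 THEN ',case2' ELSE '' END) ||
    (CASE WHEN case3 = 1 THEN ',case3' ELSE '' END) ||
    (CASE WHEN case4 = 1 THEN ',case4' ELSE '' END),
    2) AS cols
FROM flags
"""
cols = conn.execute(query).fetchone()[0]
print(cols)
```

For the sample row this gives case1,case3.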

If you have multiple rows, this query lists the columns that are 1 in every row:
select stuff((
(case when count(*) = sum(cast(case1 as int)) then ',case1' else '' end) +
(case when count(*) = sum(cast(case2 as int)) then ',case2' else '' end) +
(case when count(*) = sum(cast(case3 as int)) then ',case3' else '' end) +
(case when count(*) = sum(cast(case4 as int)) then ',case4' else '' end)), 1, 1, '')
as no_zero_columns
from your_table_name;

Is it possible to group string within a string in Teradata?

The original table (exactly the one I am using, with all the commas, brackets, etc.):
id attributes
1 123(red), 139(red), 123(white), 123(black), 139(white),
2 123(black), 139(white), 123(green),
32 223(blue), 223(red), 553(white), 123(black),
4 323(white), 139(red),
23 523(red),
I need to group the attribute numbers so that my table looks like
id attributes
1 123(red, white, black); 139(red, white);
2 123(black, green); 139(white);
32 223(blue, red); 553(white); 123(black);
4 323(white); 139(red);
23 523(red);
How can I do this?
Unfortunately I do not have access to stored procedures or to functions like oreplace and translate. I used to work with Oracle, where this is an easy task given access to stored procedures, but here I have no idea what to do.

SQL is definitely not the right language to do string processing like that :-)
I used existing code to split/create comma-delimited strings, but in TD14 it would be much easier (there's strtok_split_to_table and udfConcat).
CREATE VOLATILE TABLE vt (id INT, attrib VARCHAR(100)) ON COMMIT PRESERVE ROWS;
INSERT INTO vt(1 ,'123(red), 139(red), 123(white), 123(black), 139(white),');
INSERT INTO vt(2 ,'123(black), 139(white), 123(green),');
INSERT INTO vt(32 ,'223(blue), 223(red), 553(white), 123(black),');
INSERT INTO vt(4 ,'323(white), 139(red), ');
INSERT INTO vt(23 ,'523(red),');
WITH RECURSIVE cte
(id,
len,
remaining,
word,
pos
) AS (
SELECT
id,
POSITION(',' IN attrib || ',') - 1 AS len,
SUBSTRING(attrib || ',' FROM len + 2) AS remaining,
TRIM(SUBSTRING(attrib FROM 1 FOR len)) AS word,
1
FROM vt
UNION ALL
SELECT
id,
POSITION(',' IN remaining)- 1 AS len_new,
SUBSTRING(remaining FROM len_new + 2),
TRIM(SUBSTRING(remaining FROM 1 FOR len_new)),
pos + 1
FROM cte
WHERE remaining <> ''
)
SELECT
id,
MAX(CASE WHEN newpos = 1 THEN newgrp ELSE '' END) ||
MAX(CASE WHEN newpos = 2 THEN newgrp ELSE '' END) ||
MAX(CASE WHEN newpos = 3 THEN newgrp ELSE '' END) ||
MAX(CASE WHEN newpos = 4 THEN newgrp ELSE '' END) ||
MAX(CASE WHEN newpos = 5 THEN newgrp ELSE '' END) ||
MAX(CASE WHEN newpos = 6 THEN newgrp ELSE '' END)
-- add as many CASEs as needed
FROM
(
SELECT
id,
ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY newgrp) AS newpos,
a ||
MAX(CASE WHEN pos = 1 THEN '(' || b ELSE '' END) ||
MAX(CASE WHEN pos = 2 THEN ', ' || b ELSE '' END) ||
MAX(CASE WHEN pos = 3 THEN ', ' || b ELSE '' END) ||
MAX(CASE WHEN pos = 4 THEN ', ' || b ELSE '' END) ||
MAX(CASE WHEN pos = 5 THEN ', ' || b ELSE '' END) ||
MAX(CASE WHEN pos = 6 THEN ', ' || b ELSE '' END)
-- add as many CASEs as needed
|| '); ' AS newgrp
FROM
(
SELECT
id,
ROW_NUMBER()
OVER (PARTITION BY id, a
ORDER BY pos) AS pos,
SUBSTRING(word FROM 1 FOR POSITION('(' IN word) - 1) AS a,
TRIM(TRAILING ')' FROM SUBSTRING(word FROM POSITION('(' IN word) + 1)) AS b
FROM cte
WHERE word <> ''
) AS dt
GROUP BY id, a
) AS dt
GROUP BY id;
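If the strings can be processed outside the database, the same regrouping is much shorter in a general-purpose language. Here is a minimal Python sketch, assuming every entry has the form number(word) as in the sample data. The function name regroup is my own, and unlike the query above it keeps the numbers in order of first appearance, which is how the desired output in the question is laid out.

```python
import re

def regroup(attributes):
    """Group 'number(colour)' entries by number, keeping first-seen order."""
    groups = {}  # dicts preserve insertion order
    for number, colour in re.findall(r"(\d+)\((\w+)\)", attributes):
        groups.setdefault(number, []).append(colour)
    return " ".join(f"{n}({', '.join(c)});" for n, c in groups.items())

sample = "123(red), 139(red), 123(white), 123(black), 139(white),"
print(regroup(sample))
```

For the first sample row this prints 123(red, white, black); 139(red, white);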
