How to get malformed or string type data from a numeric column in Hive?

I have a column id (data type integer) containing the following records:
1
2
NULL
x
y
Because Hive automatically converts x and y into NULL, I first cast the id column to a string. Now I want count(id) where id is neither made up of digits [0-9] nor NULL. In my case the count should be 2, but it does not work for x and y: the NULLs are counted as well, so in my example I get 3.
I have tried LIKE, RLIKE, and also regexp_extract(id,'\&q=([^\&]+).
Can someone suggest how to achieve this?

I tried something similar and it is working for me. I created an external table with your data:
CREATE EXTERNAL TABLE temp_count (count STRING) ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t' LOCATION 'user/$username/data'
Now I am running a query like this:
select count(*) from (select (count - count) as value from temp_count where count != 'NULL')q1 where value is NULL;
and I am getting 2 as the output.
Let me know if I am missing something here
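The logic of that query can be sanity-checked outside Hive. Below is a minimal Python sketch, assuming the column has already been read as raw strings, so that the literal text 'NULL' marks a missing value as in the table above. `count_malformed` is a hypothetical helper name, and the digits-only regex is a simplification of what Hive accepts as a number.

```python
import re
from typing import Iterable, Optional

DIGITS_ONLY = re.compile(r"[0-9]+")  # assumption: a valid id is one or more digits


def count_malformed(values: Iterable[Optional[str]]) -> int:
    """Count values that are neither missing nor made up of digits only."""
    count = 0
    for value in values:
        # Skip real missing values and the literal text 'NULL'.
        if value is None or value == "NULL":
            continue
        # Anything that is not purely digits is treated as malformed.
        if DIGITS_ONLY.fullmatch(value) is None:
            count += 1
    return count


print(count_malformed(["1", "2", "NULL", "x", "y"]))  # 2
```

The two checks mirror the query: the `where count != 'NULL'` filter drops the missing values, and the `value is NULL` test keeps only the rows whose arithmetic result could not be computed.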

Related

How to retrieve the required string in SQL having a variable length parameter

Here is my problem statement:
I have a single-column table with data like this:
ROW-1>> 7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX
ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX
I want to split these values on '-' and load them into a new table. The string has 11 segments separated by '-', so the new table has 11 columns. The problem is:
A. The lengths of these values vary, but I have to keep each value either at its standard length or at whatever length it actually has,
e.g. 7302- should have four characters; if the value is shorter than that, keep it as it is (for example, 73 should be loaded as 73).
So I have to split the string while maintaining its integrity. The code I am writing is:
select
SUBSTR(PROFILE_ID,1,(case when length(instr(PROFILE_ID,'-')<>4) THEN (instr(PROFILE_ID,'-') else SUBSTR(PROFILE_ID,1,4) end)
)AS [RQUIRED_COLUMN_NAME]
from [TABLE_NAME];
I am getting a right parenthesis error.
Please help.
I used the REGEXP_SUBSTR SQL function to solve the above issue. Here is an example:
select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX', '[^-]+', 1, 1);
Output is: 7302 -- which is the 1st segment of the string
Similarly, the second segment of the '-'-separated string can be obtained by replacing the final 1 with 2 in the above query.
Example: select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX', '[^-]+', 1, 2);
Output: 2210177000, which is the 2nd segment of the string
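To check what the pattern [^-]+ returns for each occurrence number, here is a small Python sketch of the same idea. `nth_segment` is a hypothetical helper, not a database function; like the query above, it counts occurrences from 1.

```python
import re
from typing import Optional

SEGMENT = re.compile(r"[^-]+")  # one or more characters that are not '-'


def nth_segment(text: str, n: int) -> Optional[str]:
    """Return the n-th (1-based) '-'-separated segment, or None if there is none."""
    segments = SEGMENT.findall(text)
    if 1 <= n <= len(segments):
        return segments[n - 1]
    return None


row = "7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX"
print(nth_segment(row, 1))  # 7302
print(nth_segment(row, 2))  # 2210177000
```

Because each segment is returned at whatever length it has, a shorter first value such as 73 comes back as 73, which is the behaviour the question asks for. Note that this pattern skips empty segments (two consecutive '-' characters), so column positions would shift if a segment could be empty.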

Need to find string using bigquery

We have a string column with the data below,
and I want to find the null count in the string column, i.e. how many times a null value ('') appears in front of the id column in the select statement,
using BigQuery.
Don't use string position.
Expected output:
count of null ('')id = 3
1st row, 2nd row and 5th row
Below is for BigQuery Standard SQL
#standardSQL
SELECT
FORMAT(
"count of null ('')id = %d. List of id is: %s",
COUNT(*),
STRING_AGG(CAST(ID AS STRING))
) AS output
FROM `project.dataset.table`
WHERE REGEXP_CONTAINS(String, r"(?i)''\s+(?:as|)\s+(?:id|\[id\])")
Applied to the sample data from your question, the output is
Row output
1 count of null ('')id = 3. List of id is: 1,2,5
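The regular expression can be tried out locally before running it in BigQuery. The sketch below uses Python's `re` module, which accepts this particular pattern unchanged. The sample rows are hypothetical, since the question's data is not reproduced here.

```python
import re

# Same pattern as in the query: '' followed by an optional "as" and then id or [id].
PATTERN = re.compile(r"(?i)''\s+(?:as|)\s+(?:id|\[id\])")

# Hypothetical sample rows: id -> stored query string.
rows = {
    1: "select '' as id from t1",
    2: "select '' AS [id] from t2",
    3: "select 'a' as id from t3",
    4: "select '' as name from t4",
}

matching = [row_id for row_id, text in rows.items() if PATTERN.search(text)]
output = "count of null ('')id = %d. List of id is: %s" % (
    len(matching),
    ",".join(str(row_id) for row_id in matching),
)
print(output)  # count of null ('')id = 2. List of id is: 1,2
```

Rows 3 and 4 are rejected because row 3 has a non-empty literal and row 4 aliases the empty string to a column other than id.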
The idea is to normalize every string into a form you can query with like "%''asid%" or a regex:
First, replace all spaces with ''.
Replace "[" and "]" with ''.
Make the use of " or ' consistent.
Then query with like.
For example:
select 1 from (select replace(replace(replace(replace('select "" as do, "" as [id] from table1',' ',''),'[',''),']',''),'"',"'") as tt)
where tt like ("%''asid%")
It's not a "smart" idea, but it's simple.
A better idea would be to store the query columns ('"" as id') in a repeated column and the table in another column.
That way you don't need to store 'select' and 'from', you can query easily, and you can also assemble a query from the data.
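The normalization steps can be sketched in Python to see what the string looks like before the LIKE is applied. `normalize` and `has_empty_id` are hypothetical helper names used only for this illustration.

```python
def normalize(query: str) -> str:
    """Apply the same four replacements as the nested replace() calls."""
    return (
        query.replace(" ", "")   # drop all spaces
        .replace("[", "")        # drop opening brackets
        .replace("]", "")        # drop closing brackets
        .replace('"', "'")       # make the quoting consistent
    )


def has_empty_id(query: str) -> bool:
    """Equivalent of: normalized string LIKE "%''asid%"."""
    return "''asid" in normalize(query)


sample = 'select "" as do, "" as [id] from table1'
print(normalize(sample))     # select''asdo,''asidfromtable1
print(has_empty_id(sample))  # True
```

Since every space is removed, this approach also matches any alias that merely starts with "id", which is part of why the answer calls it simple rather than smart.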
If I understand correctly, you want to count the number of appearances of '' in the string column.
If so, you can use regexp_extract_all():
select t.*,
(select count(*)
from unnest(regexp_extract_all(t.string, "''")) u
) as empty_string_count
from t;
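For comparison, the same count can be reproduced in Python, assuming the goal is simply the number of non-overlapping occurrences of '' in each string. `count_empty_literals` is a hypothetical name.

```python
import re


def count_empty_literals(text: str) -> int:
    """Count non-overlapping occurrences of two consecutive single quotes."""
    return len(re.findall("''", text))


print(count_empty_literals("select '' as id, '' as name from t"))  # 2
print(count_empty_literals("select 'a' as id from t"))              # 0
```

Unlike the REGEXP_CONTAINS answer, this counts every empty literal in the string, not only those aliased to id.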

SQL Server: How to select rows which contain value comprising of only one digit

I am trying to write a SQL query that only returns rows where a specific column (let's say the 'amount' column) contains numbers made up of a single repeated digit, e.g. only '1's (1111111...) or only '2's (2222222...), etc.
In addition, the 'amount' column also contains numbers with decimal points, and such values should be returned as well, e.g. 1111.11, 2222.22, etc.
If you want to make the query generic, so that you don't have to specify each possible digit, you could change the WHERE clause to the following:
WHERE LEN(REPLACE(REPLACE(amount, LEFT(amount, 1), ''), '.', '')) = 0
This always uses the first character of the value as the digit to compare the rest of the string against.
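The replace-and-measure trick is easy to verify outside the database. The following Python sketch applies the same idea to a string; `is_single_repeated_digit` is a hypothetical helper, and the extra check that the first character is a digit is an addition that the WHERE clause itself does not make.

```python
def is_single_repeated_digit(amount: str) -> bool:
    """True if the value consists of one repeated digit, ignoring decimal points."""
    # Added guard (not in the SQL): reject empty values and non-digit first characters.
    if not amount or not amount[0].isdigit():
        return False
    first = amount[0]
    # Remove every copy of the first digit and every '.', then see if anything is left.
    return amount.replace(first, "").replace(".", "") == ""


for value in ["1111111", "2222.22", "1121", "11.21"]:
    print(value, is_single_repeated_digit(value))
```

If nothing remains after both removals, every character was either the first digit or a decimal point, which is exactly the condition LEN(...) = 0 tests.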
If you are using SQL Server, then you can try this script:
SELECT *
FROM (
SELECT CAST(amount AS VARCHAR(30)) AS amount
FROM TableName
)t
WHERE LEN(REPLACE(REPLACE(amount,'1',''),'.','')) = 0 OR
LEN(REPLACE(REPLACE(amount,'2',''),'.','')) = 0
I tried it like this; replace 1111111 with the column name:
Select replace(Str(1111111, 12, 2),0,left(11111,1))

determine DB2 text string length

I am trying to find out how to write an SQL statement that selects fields where the string is not 12 characters long; I only want the strings that are 10 characters long.
What function can do this in DB2?
I figured it would be something like this, but I can't find anything on it.
select * from table where not length(fieldName, 12)
From the similar question "DB2 - find and compare the lentgh of the value in a table field": add RTRIM, since LENGTH will return the length of the column definition. This should be correct:
select * from table where length(RTRIM(fieldName))=10
UPDATE 27.5.2019: On older DB2 versions the LENGTH function may have returned the length of the column definition. On DB2 10.5 I tried the function, and it returns the data length, not the column definition length:
select fieldname
, length(fieldName) len_only
, length(RTRIM(fieldName)) len_rtrim
from (values (cast('1234567890  ' as varchar(30)) ))
as tab(fieldName)
FIELDNAME                      LEN_ONLY    LEN_RTRIM
------------------------------ ----------- -----------
1234567890                              12          10
One can test this by using this term:
where length(fieldName)!=length(rtrim(fieldName))
This will grab records with strings (in the fieldName column) that are 10 characters long:
select * from table where length(fieldName)=10
Usually we write the statement below:
select * from table where length(ltrim(rtrim(field)))=10;
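The effect of trimming before measuring can be reproduced with a small experiment. The sketch below uses Python's built-in sqlite3 module rather than DB2, on the assumption that SQLite's length() and rtrim() behave like their DB2 counterparts for VARCHAR data; the table name `demo` and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (fieldName TEXT)")
conn.executemany(
    "INSERT INTO demo (fieldName) VALUES (?)",
    [
        ("1234567890  ",),   # 10 characters plus 2 trailing spaces
        ("1234567890",),     # exactly 10 characters
        ("123456789012",),   # 12 characters
        ("short",),
    ],
)

# Keep only values that are 10 characters long once trailing spaces are removed.
rows = conn.execute(
    "SELECT fieldName, length(fieldName), length(rtrim(fieldName)) "
    "FROM demo "
    "WHERE length(rtrim(fieldName)) = 10 "
    "ORDER BY length(fieldName)"
).fetchall()

for row in rows:
    print(row)
conn.close()
```

The padded value is returned only because of the rtrim(); with a plain length(fieldName) = 10 filter it would be excluded, since its untrimmed length is 12.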

add record in SQL based on last character of a field

I have a field in SQL whose value ends in a 1 or a 0. If a value ends in 1 but there is no corresponding value ending in 0 (or the other way round), I would like to add that record.
Three different examples:
Field value
Data1 ( I would like to add another record containing Data0)
Data0 ( I would like to add another record containing Data1)
Data0 and Data1 both exists in table ( Do nothing )
insert into test(col, another_column_1,...,another_column_n)
select substr(col,1, length(col)-1) || --this is the base value
max(mod(substr(col,length(col),length(col))+1,2)) --this is the termination (0 or 1)
as col ,
max(another_column_1),
...
max(another_column_n)
from test
where substr(col,length(col),length(col)) between '0' and '1'
group by substr(col,1, length(col)-1)
having count(*)=1;
Updated for Oracle
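To make the intent of that insert easier to follow, here is a plain Python sketch of the same rule: group values by everything except the last character, and when only one of the 0/1 pair exists, produce the other. `missing_counterparts` is a hypothetical helper that only handles the key column, not the other columns carried along by max().

```python
from typing import Iterable, List


def missing_counterparts(values: Iterable[str]) -> List[str]:
    """Return the values that would have to be inserted to complete each 0/1 pair."""
    present = set(values)
    missing = []
    for value in sorted(present):
        # Only values ending in 0 or 1 take part, as in the WHERE clause.
        if value and value[-1] in ("0", "1"):
            partner = value[:-1] + ("1" if value[-1] == "0" else "0")
            if partner not in present:
                missing.append(partner)
    return missing


print(missing_counterparts(["Data1"]))             # ['Data0']
print(missing_counterparts(["Data0"]))             # ['Data1']
print(missing_counterparts(["Data0", "Data1"]))    # []
```

The flip from 0 to 1 and back corresponds to the mod(last_digit + 1, 2) expression, and the "partner not in present" test plays the role of having count(*) = 1.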
If I understood your question correctly, you can use the LIKE operator.
You can do something like:
(field like '%1' or field like '%0') and not field like '%0%1'
But generally, SQL is not suitable for text processing. This LIKE thing is pretty much the most advanced text processing feature in SQL.