Need to find string using bigquery - sql

We have below string column and having below data
and I want to find Null count present in string columns means how many times null value('') present in front of id column present in select statement
using big query.
Don't use string position.
Expected output:
count of null ('')id =3
1st row,2nd row and 5th row

Below is for BigQuery Standard SQL
#standardSQL
SELECT
FORMAT(
"count of null ('')id = %d. List of id is: %s",
COUNT(*),
STRING_AGG(CAST(ID AS STRING))
) AS output
FROM `project.dataset.table`
WHERE REGEXP_CONTAINS(String, r"(?i)''\s+(?:as|)\s+(?:id|\[id\])")
if to apply to sample data from your question - the output is
Row output
1 count of null ('')id = 3. List of id is: 1,2,5

The idea is to unify all strings to something you can query with like = "%''asid%" or regex
First replace all spaces with ''
replace "[", "]" with ''.
Make the use of " or ' consistent.
Then query with like.
For example:
select 1 from (select replace(replace(replace(replace('select "" as do, "" as [id] form table1',' ',''),'[',''),']',''),'"',"'") as tt)
where tt like ("%''asid%")
Its not a "smart" idea but its simple.
A better idea will be to save the query columns in a repeat column '"" as id' and the table in another column.
You don't need to save 'select' and 'from' this way you can query easily and also assemble a query from the data.

If I understand correctly, you want to count the number of appearances of '' in the string column.
If so, you can use regexp_extract_all():
select t.*,
(select count(*)
from unnest(regexp_extract_all(t.string, "''")) u
) as empty_string_count
from t;

Related

How to get tally of unique words using only SQL?

How do you get a list of all unique words and their frequencies ("tally") in one text column of one table of an SQL database? An answer for any SQL dialect in which this is possible would be appreciated.
In case it helps, here's a one-liner that does it in Ruby using Sequel:
Hash[DB[:table].all.map{|r| r[:text]}.join("\n").gsub(/[\(\),]+/,'').downcase.strip.split(/\s+/).tally.sort_by{|_k,v| v}.reverse]
To give an example, for table table with 4 rows with each holding one line of Dr. Dre's Still D.R.E.'s refrain in a text field, the output would be:
{"the"=>4,
"still"=>3,
"them"=>3,
"for"=>2,
"d-r-e"=>1,
"it's"=>1,
"streets"=>1,
"love"=>1,
"got"=>1,
"i"=>1,
"and"=>1,
"beat"=>1,
"perfect"=>1,
"to"=>1,
"time"=>1,
"my"=>1,
"taking"=>1,
"girl"=>1,
"low-lows"=>1,
"in"=>1,
"corners"=>1,
"hitting"=>1,
"world"=>1,
"across"=>1,
"all"=>1,
"gangstas"=>1,
"representing"=>1,
"i'm"=>1}
This works, of course, but is naturally not as fast or elegant as it would be to do it in pure SQL - which I have no clue if that's even in the realm of possibilites...
I guess it would depend on how the SQL database would look like. You would have to first turn your 4 row "database" into a data of single column, each row representing one word. To do that you could use something like String_split, where every space would be a delimiter.
STRING_SPLIT('I'm representing for them gangstas all across the world', ' ')
https://www.sqlservertutorial.net/sql-server-string-functions/sql-server-string_split-function/
This would turn it into a table where every word is a row.
Once you've set up your data table, then it's easy.
Your_table:
[Word]
I'm
representing
for
them
...
world
Then you can just write:
SELECT Word, count(*)
FROM your_table
GROUP BY Word;
Your output would be:
Word | Count
I'm 1
representing 1
I had a play using XML in sql server. Just an idea :)
with cteXML as
(
select *
,cast('<wd>' + replace(songline,' ' ,'</wd><wd>') + '</wd>' as xml) as XMLsongline
from #tSongs
),cteBase as
(
select p.value('.','nvarchar(max)') as singleword
from cteXML as x
cross apply x.XMLsongline.nodes('/wd') t(p)
)
select b.singleword,count(b.singleword)
from cteBase as b
group by b.singleword

How can I compare two columns for similarity in SQL Server?

I have one column that called 'message' and includes several data such as fund_no, detail, keywords. This column is in table called 'trackemails'.
I have another table, called 'sendemails' that has a column called 'Fund_no'.
I want to retrieve all data from 'trackemail' table that the column 'message' contains characters same as 'Fund_no' in 'trackemails' Table.
I think If I want to check the equality, I would write this code:
select
case when t.message=ts.fund_no then 1 else 0 end
from trackemails t, sendemails s
But, I do want something like below code:
select
case when t.message LIKE ts.fund_no then 1 else 0 end
from trackemails t, sendemails s
I would be appreciate any advice to how to do this:
SELECT *
FROM trackemails tr
INNER JOIN sendemail se on tr.Message like '%' + se.Fund_No + '%'
Dear Check SQL CHARINDEX() Function. This function finds a string in another string and returns int for the position they match. Like
SELECT CHARINDEX('ha','Elham')
-- Returns: 3
And as you need:
SELECT *
,(SELECT *
FROM sendemail
WHERE CHARINDEX(trackemails.Message,sendemail.Fund_No)>0 )
FROM trackemails
For more information, If you want something much better for greater purposes, you can use Fuzzy Lookup Component in SSDT SSIS. This Component gives you a new column in the output which shows the Percentages of similarity of two values in two columns.

MS Access Expression to Find Records Where the Value to the Left of a Comma is Greater than Zero

I have some records in a column that contain IDs and some of these records contain multiple IDs separated by commas. Additionally there are some records where I have ",3" and ",2" when they should simply be "3" and "2". I do not have write privileges in this DB so updating those records is not an option.
I am trying to write a query that returns records that have a comma where the value to the left of any comma in the record is greater than 0 e.g. "2,3", "2,3,12" etc but NOT ",3" or ",2".
What would this expression look like in MS Access?
Thanks in advance.
If you want to remove the starting comma from the records when you return them, you can do so using a simple query:
SELECT IIF(MyField LIKE ",*", Right(MyField, Len(MyField)-1), MyField)
FROM MyTable
To answer your original question, you could simply use Val:
SELECT * FROM YourTable WHERE Val([YourField]) > 0
I would simply use:
select t.*
from t
where val not like ",*";
This doesn't handle the 0 part, but you don't give any examples in your answer. Perhaps this answers that part:
select t.*
from t
where val not like ",*" and val not like "*0,*";

SQL Server: How to select rows which contain value comprising of only one digit

I am trying to write a SQL query that only returns rows where a specific column (let's say 'amount' column) contains numbers comprising of only one digit, e.g. only '1's (1111111...) or only '2's (2222222...), etc.
In addition, 'amount' column contains numbers with decimal points as well and these kind of values should also be returned, e.g. 1111.11, 2222.22, etc
If you want to make the query generic that you don't have to specify each possible digit you could change the where to the following:
WHERE LEN(REPLACE(REPLACE(amount,LEFT(amount,1),''),'.','') = 0
This will always use the first digit as comparison for the rest of the string
If you are using SQL Server, then you can try this script:
SELECT *
FROM (
SELECT CAST(amount AS VARCHAR(30)) AS amount
FROM TableName
)t
WHERE LEN(REPLACE(REPLACE(amount,'1',''),'.','') = 0 OR
LEN(REPLACE(REPLACE(amount,'2',''),'.','') = 0
I tried like this in place of 1111111 replace with column name:
Select replace(Str(1111111, 12, 2),0,left(11111,1))

add record in SQL based on last character of a field

I have a field in sql that contains a 1 or 0 at the end. What I am trying to do is if the field has a 1 at the end but no corresponding 0 I would like to add that record.
3 different examples
Field value
Data1 ( I would like to add another record containing Data0)
Data0 ( I would like to add another record containing Data1)
Data0 and Data1 both exists in table ( Do nothing )
insert into test(col, another_column_1,...,another_column_n)
select substr(col,1, length(col)-1) || --this is the base value
max(mod(substr(col,length(col),length(col))+1,2)) --this is the termination (0 or 1)
as col ,
max(another_column_1),
...
max(another_column_n)
from test
where substr(col,length(col),length(col)) between '0' and '1'
group by substr(col,1, length(col)-1)
having count(*)=1;
you can see test here
Updated for Oracle
If I understood your question correctly, you can use the LIKE operator.
You can do something like:
(field like '%1' or field like '%0') and not field like '%0%1'
But generally, SQL is not suitable for text processing. This LIKE thing is pretty much the most advanced text processing feature in SQL.