summarize word count for a timeframe - google-bigquery

I have the below table which stores the response text and the keyword search associated with it.
create table nlp.search(response string, words string,inquiry_time timestamp);
insert into nlp.search values('how to reset password','reset word password',TIMESTAMP ("2021-09-19 05:30:00+00"));
insert into nlp.search values('how to reset password','reset passphrase',TIMESTAMP ("2021-09-20 07:30:00+00"));
insert into nlp.search values('how to reset password','password',TIMESTAMP ("2021-09-16 08:30:00+00"));
insert into nlp.search values('how to reset password','reset',TIMESTAMP ("2021-09-14 08:30:00+00"));
I want to provide a summary report in this format
response and the count of each individual words associated with it.
response individual_word_count
how to reset password reset(3) word(1) password(2) passphrase(1)
also the timestamp column inquiry_time can be passed to narrow down the date range and the summary values must be computed accordingly
e.g for timeframe filter 2021-09-19 till 2021-09-20
response individual_word_count
how to reset password reset(2) word(1) password(1) passphrase(1)
can this be accomplished using a view?

Use below
select response, word, count(1) ndividual_word_count
from `nlp.search`,
unnest(split(words, ' ')) word
where date(inquiry_time) between '2021-09-19' and '2021-09-20'
group by response, word
if applied to sample data in your question - output is
I Need to display the word and count in 1 single column
use below then
select response,
string_agg(format('%s (%i)', word, individual_word_count)) counts
from (
select response, word, count(1) individual_word_count
from `nlp.search`,
unnest(split(words, ' ')) word
where date(inquiry_time) between '2021-09-19' and '2021-09-20'
group by response, word
)
group by response
with output

Related

What is the appropriate hive query?

This is my case study
Identify which numbers made calls as well as made SMS messages, where a number of calls made should be more than 10 and a number of messages should be more than 5.
I have written the query:
select call.user_no from (select user_no, count(other_no) as total_calls
from calls group by user_no having total_calls > 10) as call
,(select user_no, count(other_no) as total_msgs
from messages group by user_no having total_msgs > 5) as msgs where
call.user_no = msgs.user_no;
These are the tables I have created
create table calls (user_no string,other_no string,direction string,duration smallint,call_timestamp string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
tblproperties("skip.header.line.count"="1");
create table messages (user_no string,other_no string,direction string,msg_len string,call_timestamp string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
tblproperties("skip.header.line.count"="1");
The output shows no user.
What is the error?

find diffrences between 2 tables sql and how can i get the changed value?

i have this query
insert into changes (id_registro)
select d2.id_registro
from daily2 d2
where exists (
select 1
from daily d1
where
d1.id_registro = d2.id_registro
and (d2.origen, d2.sector, d2.entidad_um, d2.sexo, d2.entidad_nac, d2.entidad_res,
d2.municipio_res, d2.tipo_paciente,d2.fecha_ingreso, d2.fecha_sintomas,
d2.fecha_def, d2.intubado, d2.neumonia, d2.edad, d2.nacionalidad, d2.embarazo,
d2.habla_lengua_indig, d2.diabetes, d2.epoc, d2.asma, d2.inmusupr, d2.hipertension,
d2.otra_com, d2.cardiovascular, d2.obesidad,
d2.renal_cronica, d2.tabaquismo, d2.otro_caso, d2.resultado, d2.migrante,
d2.pais_nacionalidad, d2.pais_origen, d2.uci )
<>
(d1.origen, d1.sector, d1.entidad_um, d1.sexo, d1.entidad_nac, d1.entidad_res,
d1.municipio_res, d1.tipo_paciente, d1.fecha_ingreso, d1.fecha_sintomas,
d1.fecha_def, d1.intubado, d1.neumonia, d1.edad, d1.nacionalidad, d1.embarazo,
d1.habla_lengua_indig, d1.diabetes, d1.epoc, d1.asma, d1.inmusupr, d1.hipertension,
d1.otra_com, d1.cardiovascular, d1.obesidad,
d1.renal_cronica, d1.tabaquismo, d1.otro_caso, d1.resultado, d1.migrante,
d1.pais_nacionalidad, d1.pais_origen, d1.uci ))
it results in an insersion data that doesn't exist in another table, that's fine. but i want know exactly which field has changed to store it in a log table
You don't mention precisely what you expect to see in your output but basically to accomplish what you're after you'll need a long sequence of CASE clauses, one for each column
e.g. one approach might be to create a comma-separated list of the column names that have changed:
INSERT INTO changes (id_registro, column_diffs)
SELECT d2.id_registro,
CONCAT(
CASE WHEN d1.origen <> d2.origen THEN 'Origen,' ELSE '' END,
CASE WHEN d1.sector <> d2.sector THEN 'Sector,' ELSE '' END,
etc.
Within the THEN part of the CASE you can build whatever detail you want to show
e.g. a string showing before and after values of the columns CONCAT('Origen: Was==> ', d1.origen, ' Now==>', d2.origen). Presumably though you'll also need to record the times of these changes if there can be multiple updates to the same record throughout the day.
Essentially you'll need to decide what information you want to show in your logfile, but based on your example query you should have all the information you need.

OPENJSON - modify statement to ignore first part of the string

We receive auto-generated emails from an application, and we export those to our database as they arrive at the Inbox. The table is called dbo.MailArchive.
Up until recently, the body of the email has always looked like this...
Status: Completed
Successful actions count: 250
Page load count: 250
...except with different numbers and statuses. Note that there is a carriage return on the blank line after Page load count.
The entirety of this data gets written to a field called Mail_Body - then we run the following statement using OPENJSON to parse those lines into their own columns in the record:
DECLARE #PI varchar(7) = '%[^' + CHAR(13) + CHAR(10) + ']%';
SELECT j.Status,
j.Successful_Actions_Count,
j.Page_Load_Count
FROM dbo.MailArchive m
CROSS APPLY(VALUES(REVERSE(m.Mail_Body),PATINDEX(#PI,REVERSE(m.Mail_Body)))) PI(SY,I)
CROSS APPLY(VALUES(REVERSE(STUFF(PI.SY,1,PI.I,''))))S(FixedString)
CROSS APPLY OPENJSON (CONCAT('{"', REPLACE(REPLACE(S.FixedString, ': ', '":"'), CHAR(13) + CHAR(10), '","'), '"}'))
WITH (Status varchar(100) '$.Status',
Successful_Actions_Count int '$."Successful actions count"',
Page_Load_Count int '$."Page load count"') j;
Beginning today, there are certain emails where the body of the email looks like this:
Agent did not meet defined success criteria on this run.
Status: Completed
Successful actions count: 250
Page load count: 250
To clarify, that's one new line at the top, a carriage return at the end of that line, and a carriage return on the blank line between the new line and the Status line. At this time, there is no consistent way to predict which emails will come in with this new line, and which ones won't.
How can I modify our OPENJSON statement to say, If this first line exists in the body, skip/ignore it and parse lines 3 through 5, else just do exactly what I have above? Or perhaps even better to future-proof it, always ignore everything before the word Status?
Since your data has new leading and trailing rows, I think a simple aggregation in concert with a string_split() and a CROSS APPLY would be more effective than my previous XML answer and the current JSON approach
Example or dbFiddle
Select A.ID
,Status = stuff(Pos1,1,charindex(':',Pos1),'')
,Action = try_convert(int,stuff(Pos2,1,charindex(':',Pos2),''))
,PageCnt = try_convert(int,stuff(Pos3,1,charindex(':',Pos3),''))
From YourTable A
Cross Apply (
Select [Pos1] = max(case when Value like 'Status:%' then value end)
,[Pos2] = max(case when Value like '%actions count:%' then value end)
,[Pos3] = max(case when Value like 'Page load count:%' then value end)
From string_split(SomeCol,char(10))
) B
Returns
ID Status Action PageCnt
1 Completed 250 250
Note: Use an OUTER APPLY if you want to see NULLs

APEX 5: How can I store the results of a MultiSelection from a List in an APEX-Item?

My example list:
How can I store the results in the Input Field "Your Choose" as comma-separate values (in this example: ALABAMA, ARIZONA).
Would APEX_UTIL.TABLE_TO_STRING be suitable?
You need to be more specific while asking questions.
I assume that you have page with
item PX_MULTISELECT.
process Before Header that loads data from DB to the input PX_MULTISELECT
And basing on this I assume you are asking how to get data separated with : from DB.
In process that loads data you should fetch data from db using listagg function
begin
select
listagg(COLUMN, ':') within group (order by 1)
into
:PX_MULTISELECT
from
...
where
...
end;
LISTAGG documentation

Logical operator sql query going weirdly wrong

I am trying to fetch results from my sqlite database by providing a date range.
I have been able to fetch results by providing 3 filters
1. Name (textfield1)
2. From (date)(textfield2)
3. To (date)(textfield3)
I am inserting these field values taken from form into a table temp using following code
Statement statement6 = db.createStatement("INSERT INTO Temp(date,amount_bill,narration) select date,amount,narration from Bills where name=\'"+TextField1.getText()+"\' AND substr(date,7)||substr(date,4,2)||substr(date,1,2) <= substr (\'"+TextField3.getText()+"\',7)||substr (\'"+TextField3.getText()+"\',4,2)||substr (\'"+TextField3.getText()+"\',1,2) AND substr(date,7)||substr(date,4,2)||substr(date,1,2) >= substr (\'"+TextField2.getText()+"\',7)||substr (\'"+TextField2.getText()+"\',4,2)||substr (\'"+TextField2.getText()+"\',1,2) ");
statement6.prepare();
statement6.execute();
statement6.close();
Now if i enter the following input in my form for the above filters
1.Ricky
2.01/02/2012
3.28/02/2012
It fetches date between these date ranges perfectly.
But now i want to insert values that are below and above these 2 date ranges provided.
I have tried using this code.But it doesnt show up any result.I simply cant figure where the error is
The below code is to find entries having date lesser than 01/02/2012 and greater than 28/02/2012.
Statement statementVII = db.createStatement("INSERT INTO Temp5(date,amount_rec,narration) select date,amount,narration from Bills where name=\'"+TextField1.getText()+"\' AND substr(date,7)||substr(date,4,2)||substr(date,1,2) < substr (\'"+TextField2.getText()+"\',7)||substr (\'"+TextField2.getText()+"\',4,2)||substr (\'"+TextField2.getText()+"\',1,2) AND substr(date,7)||substr(date,4,2)||substr(date,1,2) > substr (\'"+TextField3.getText()+"\',7)||substr (\'"+TextField3.getText()+"\',4,2)||substr (\'"+TextField3.getText()+"\',1,2)");
statementVII.prepare();
statementVII.execute();
statementVII.close();
Anyone sound on this,please guide.Thanks.
you need to use an Or clause together with brackets:
WHERE name='....' AND (yourDateField<yourLowerDate OR yourDateField>yourHigherDate)