I have an array of structs in a file, like below:
[{"A":"1","B":"2","C":"3"},{"A":"4","B":"5","C":"6"},{"A":"7","B":"8","C":"9"}]
How can I get the first and last values of column "A" ("1","7")?
I need to write this in Hive SQL.
Thanks in advance.
The first element of an array is array_name[0]; the last is array_name[size(array_name)-1].
Demo:
select example_data[0].A, example_data[size(example_data)-1].A
from
( --Your example data
select array(named_struct("A","1","B","2","C","3"),named_struct("A","4","B","5","C","6"),named_struct("A","7","B","8","C","9")) as example_data
)s;
OK
1 7
Time taken: 2.72 seconds, Fetched: 1 row(s)
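To check the indexing logic outside Hive, here is a minimal Python sketch of the same idea. The name example_data mirrors the demo above; first_and_last is a hypothetical helper written for illustration, not a Hive function.

```python
# Python analogue of array_name[0] and array_name[size(array_name)-1].
example_data = [
    {"A": "1", "B": "2", "C": "3"},
    {"A": "4", "B": "5", "C": "6"},
    {"A": "7", "B": "8", "C": "9"},
]


def first_and_last(rows, field):
    """Return the field of the first and last struct, or None for an empty array."""
    if not rows:
        return None
    # len(rows) - 1 plays the role of size(array_name)-1 in the Hive query.
    return rows[0][field], rows[len(rows) - 1][field]


print(first_and_last(example_data, "A"))
```

With a single-element array the first and last values are the same element, which is also what the Hive expressions return.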
I have a table which has the following data
Ticketid created Details
205853669 2020-03-05 #CLOSE# Next action value://346004/ next action value://346002/ or value://346008/
205853670 2020-03-06 #Archive Next action value://346088/ next action value://346077/ or value://346057/
The "value://" pattern is the same in every row of the Details column. I want to extract the numbers that follow it, one per row:
ticketid Numbers
205853669 346004
205853669 346002
205853669 346008
205853670 346088
205853670 346077
205853670 346057
I am using standard SQL only.
So far I have written something like this:
select ticketid,TRIM(REPLACE(SUBSTR(
details, STRPOS(details, "value//"),10
),"value//"","")) AS number from table
Below is for BigQuery Standard SQL
#standardSQL
SELECT Ticketid, Numbers
FROM `project.dataset.table`,
UNNEST(REGEXP_EXTRACT_ALL(Details, r'value://(\d+)/')) Numbers
Applied to the sample data from your question, the output is
Row Ticketid Numbers
1 205853669 346004
2 205853669 346002
3 205853669 346008
4 205853670 346088
5 205853670 346077
6 205853670 346057
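For anyone who wants to try the pattern before running it in BigQuery, here is a small Python sketch of the same extract-all-then-flatten idea. The rows list and the extract_numbers helper are made up for illustration; only the regular expression comes from the query above.

```python
import re

# Sample rows copied from the question: (Ticketid, Details).
rows = [
    (205853669, "#CLOSE# Next action value://346004/ next action value://346002/ or value://346008/"),
    (205853670, "#Archive Next action value://346088/ next action value://346077/ or value://346057/"),
]

# Same pattern as in the REGEXP_EXTRACT_ALL call.
PATTERN = re.compile(r"value://(\d+)/")


def extract_numbers(rows):
    """Emit one (ticket_id, number) pair per match, like UNNEST over the extracted array."""
    return [
        (ticket_id, number)
        for ticket_id, details in rows
        for number in PATTERN.findall(details)
    ]


for ticket_id, number in extract_numbers(rows):
    print(ticket_id, number)
```

Because the pattern has a single capture group, findall returns only the digits, just as REGEXP_EXTRACT_ALL returns the captured substring.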
The query below would work. It splits the Details text on 'value://' and then extracts the 6-digit id from each piece.
with `project.dataset.table` as (
select id, split(details, 'value://') AS number from (
select '1' as id, '#CLOSE# Next action value://346004/ next action value://346002/ or value://346008/' as details
union all
select '2' as id, '#Archive Next action value://346088/ next action value://346077/ or value://346057/'
)
)
select id, regexp_extract(number1, "\\d{6}") as number
from `project.dataset.table` ,
UNNEST( number ) number1
where regexp_extract(number1, "\\d{6}") is not null
One remark about the UNNEST function. As per the documentation:
The UNNEST operator takes an ARRAY and returns a table, with one row for each element in the ARRAY.
If you have only a few 'value://' occurrences per comment, this will not cause much of a problem, but with an unbounded number of 'value://' occurrences it might become a performance bottleneck, so keep that in mind. On the other hand, this is the only way I know to achieve this using CloudSQL.
I have a string in a text column. I want to extract the hashtag values from the string into a new table so that I can find the distinct count for each hashtag.
Example strings->
NeverTrump is never more. They were crushed last night in Cleveland at
Rules Committee by a vote of 87-12. MAKE AMERICA GREAT AGAIN!
CrookedHillary is outspending me by a combined 31 to 1 in Florida,
Ohio, & Pennsylvania. I haven't started yet!
CrookedHillary is not qualified!
MakeAmericaSafeAgain!#GOPConvention #RNCinCLE
MakeAmericaGreatAgain #ImWithYou
I am outlining the steps here because I'm not that good with the query; I may update the answer once I get it right.
Replace '#' in the string with ' #'.
Split the string into words, using whitespace as the delimiter.
Use explode() with LATERAL VIEW to get one row per word.
Use a WHERE condition to keep only the words starting with "#". A LIKE '#%' condition should work.
Then add a GROUP BY to get the count of each hashtag.
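The steps above can be sketched in Python to see what each one contributes. This is only an illustration of the logic, not Hive code; hashtag_counts and sample are hypothetical names, and the two sample strings are taken from the question.

```python
import re
from collections import Counter


def hashtag_counts(texts):
    """Count hashtags across all texts, following the steps listed above."""
    counts = Counter()
    for text in texts:
        spaced = text.replace("#", " #")   # put a space before every '#'
        words = re.split(r"\s", spaced)    # split on whitespace (the "explode" step)
        # keep only words starting with '#', then count them (the GROUP BY step)
        counts.update(word for word in words if word.startswith("#"))
    return counts


sample = [
    "MakeAmericaSafeAgain!#GOPConvention #RNCinCLE",
    "MakeAmericaGreatAgain #ImWithYou",
]
print(hashtag_counts(sample))
```

The first step matters for strings like "MakeAmericaSafeAgain!#GOPConvention": without the inserted space, the hashtag would stay glued to the preceding word and the filter would miss it.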
This is what @lazilyInitialised described. Here is a query using your data example:
with your_data as (--This is your data example, use your table instead of this CTE
select stack( 1,
1, --ID
" NeverTrump is never more. They were crushed last night in Cleveland at Rules Committee by a vote of 87-12. MAKE AMERICA GREAT AGAIN!
CrookedHillary is outspending me by a combined 31 to 1 in Florida, Ohio, & Pennsylvania. I haven't started yet!
CrookedHillary is not qualified!
MakeAmericaSafeAgain!#GOPConvention #RNCinCLE
MakeAmericaGreatAgain #ImWithYou
"
) as (id, str)
)
select id, word as hashtag
from
(
select id, word
from your_data d
lateral view outer explode(split(regexp_replace(d.str, '#',' #' ),'\\s')) l as word --replace hash w space+hash, split and explode words
)s
where word rlike '^#'
;
Result:
OK
id hashtag
1 #GOPConvention
1 #RNCinCLE
1 #ImWithYou
Time taken: 0.405 seconds, Fetched: 3 row(s)
I have a table of 811 records. I want to fetch five records at a time and assign them to a variable. On the next iteration of the Foreach Loop task in SSIS, it should fetch the next five records and overwrite the variable. I have tried doing this with a cursor but couldn't find a solution. Any help will be highly appreciated. My table looks like this, for example:
ServerId ServerName
1 Abc11
2 Cde22
3 Fgh33
4 Ijk44
5 Lmn55
6 Opq66
7 Rst77
. .
. .
. .
I want the query to take the first five names, as follows, and assign them to the variable:
ServerId ServerName
1 Abc11
2 Cde22
3 Fgh33
4 Ijk44
5 Lmn55
The next iteration should then take the next five names and overwrite the variable's value, and so on until the last record is consumed.
Taking ltn's answer into consideration this is how you can achieve limiting the rows in SSIS.
The Design will look like
Step 1 : Create the variables
Name DataType
Count int
Initial int
Final int
Step 2 : For the 1st Execute SQL Task write the sql to store the count
Select count(*) from YourTable
In the General tab of this task Select the ResultSet as Single Row.
In the ResultSet tab map the result to the variable
ResultName VariableName
0 User::Count
Step 3 : In the For Loop container enter the expression as shown below
Step 4 : Inside the For Loop drag an Execute SQL Task and write the expression
In Parameter Mapping map the initial variable
VariableName Direction DataType ParameterName ParameterSize
User::Initial Input NUMERIC 0 -1
Result Set tab
Result Name Variable Name
0 User::Final
Inside the DFT you can write the SQL to get the particular rows.
Click on Parameters and select the variables Initial and Final.
If your data will not be updated between paging cycles and the sort order is always the same, you could try an approach similar to:
CREATE PROCEDURE TEST
(
@StartNumber INT,
@TakeNumber INT
)
AS
SELECT TOP(@TakeNumber)
*
FROM(
SELECT
RowNumber=ROW_NUMBER() OVER(ORDER BY IDField DESC),
NameField
FROM
TableName
)AS X
WHERE RowNumber>=@StartNumber
ORDER BY RowNumber
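To make the paging arithmetic concrete, here is a rough Python sketch of how a caller could step through the rows five at a time. It assumes the rows are already in the fixed sort order; take_page stands in for the procedure above, and pages and servers are hypothetical names used only for illustration.

```python
def take_page(rows, start_number, take_number):
    """Rows whose 1-based row number is >= start_number, capped at take_number rows."""
    numbered = list(enumerate(rows, start=1))  # plays the role of ROW_NUMBER()
    return [row for number, row in numbered if number >= start_number][:take_number]


def pages(rows, size):
    """Yield consecutive pages until the last record is consumed."""
    start = 1
    while start <= len(rows):
        yield take_page(rows, start, size)
        start += size  # the next call begins right after the previous page


servers = ["Abc11", "Cde22", "Fgh33", "Ijk44", "Lmn55", "Opq66", "Rst77"]
for page in pages(servers, 5):
    print(page)
```

The last page may hold fewer than five rows, which is what you would see with 811 records as well (162 full pages and one page with a single row).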
I have PL/SQL code that fetches data from a table with 30 rows, of which 10 columns hold values separated by a delimiter of length 3. I need to convert one row into multiple rows based on the number of fields in those 10 columns.
So I am loading all the data into a temp table, and on the temp table I am opening a cursor which splits the data and inserts multiple rows into the main table.
Inside the cursor I am using regexp_substr to split the value, with the regular expression [^\\|]+{3}, but I am not getting the expected values after splitting.
Sample data used for test case is
100|||200||300|||400||||0
After splitting I should get values as below
100, 200||300 , 400 , |0
But what I am getting is
100 , 200, 300 ,400, 0
Can anyone suggest the proper way to do it?
Waiting for reply!
Thanks
Try this. Hope it helps.
SELECT REPLACE('100|||200||300|||400||||0','|||',',') OUTPUT FROM DUAL;
----------------------------OUTPUT---------------------------------------------
OUTPUT
100,200||300,400,|0
-----------------------------------------------------------------------------
Regex (\|*\d.*?)\|{3}|(\|*\d$) captures what you're after here.
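If it helps to test the expected split outside Oracle, here is a short Python sketch. It assumes the delimiter is exactly three consecutive pipes consumed left to right, which is also how the REPLACE call above behaves; split_on_triple_pipe is a hypothetical helper name.

```python
import re


def split_on_triple_pipe(value):
    """Split on each run of exactly three pipes, scanning left to right."""
    return re.split(r"\|{3}", value)


print(split_on_triple_pipe("100|||200||300|||400||||0"))
```

A run of two pipes is left alone, and in a run of four pipes the first three are taken as the delimiter, so the leftover pipe stays attached to the next value.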
I have an SQL query giving me X results. I want the query output to have a column called
count, making the output something like this:
count id section
1 15 7
2 3 2
3 54 1
4 7 4
How can I make this happen?
So in your example, "count" is the derived sequence number? I don't see what pattern is used to determine the count must be 1 for id=15 and 2 for id=3.
count id section
1 15 7
2 3 2
3 54 1
4 7 4
If id contained unique values, and you order by id you could have this:
count id section
1 3 2
2 7 4
3 15 7
4 54 1
Looks to me like mikeY's DSum approach could work. Or you could use a different approach to a ranking query as Allen Browne described at this page
Edit: You could use DCount instead of DSum. I don't know how the speed would compare between the two, but DCount avoids creating a field in the table simply to store a 1 for each row.
DCount("*","YourTableName","id<=" & [id]) AS counter
Whether you go with DCount or DSum, the counter values can include duplicates if the id values are not unique. If id is a primary key, no worries.
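To see the duplicate caveat in action, here is a tiny Python sketch of what the "id<=" criterion computes for each row. It is not Access code; running_count and the two id lists are made up for illustration.

```python
def running_count(ids):
    """For each id, count the rows whose id is less than or equal to it."""
    return [sum(1 for other in ids if other <= current) for current in ids]


unique_ids = [15, 3, 54, 7]      # ids from the question, all distinct
duplicate_ids = [3, 7, 7, 15]    # hypothetical ids with a repeated value

print(running_count(unique_ids))
print(running_count(duplicate_ids))
```

With distinct ids every row gets its own counter value. With the repeated 7, both of those rows get the same value and one counter value is skipped.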
I frankly don't understand what it is you want, but if all you want is a sequence number displayed on your form, you can use a control bound to the form's CurrentRecord property. A control with the ControlSource =CurrentRecord will have an always-accurate "record number" that is in sequence, and that will update when the form's Recordsource changes (which may or may not be desirable).
You can then use that number to navigate around the form, if you like.
But this may not be anything like what you're looking for -- I simply can't tell from the question you've posted and the "clarifications" in comments.
The only trick I have seen is if you have a sequential id field, you can create a new field in which the value for each record is 1. Then you do a running sum of that field.
Add to your query
DSum("[New field with 1 in it]","[Table Name]","[ID field]<=" & [ID Field])
as counterthing
That should produce a sequential count in Access which is what I think you want.
HTH.
(Stolen from Rob Mills here:
http://www.access-programmers.co.uk/forums/showthread.php?p=160386)
Alright, I guess this comes close enough to constitute an answer: the following link specifies two approaches: http://www.techrepublic.com/blog/microsoft-office/an-access-query-that-returns-every-nth-record/
The first approach assumes that you have an ID value and uses DCount (similar to @mikeY's solution).
The second approach assumes you're OK creating a VBA function that will run once for EACH record in the recordset, and will need to be manually reset (with some VBA) every time you want to run the count - because it uses a "static" value to run its counter.
As long as you have a reasonable number of records (hundreds, not thousands), the second approach looks like the easiest/most powerful to me.
This function can be called from each record if available from a module.
Example: incrementingCounterTimeFlaged(10,[anyField]) should provide your query rows an int incrementing from 0.
'provides incrementing int values 0 to n
'resets to 0 some seconds after first call
Function incrementingCounterTimeFlaged(resetAfterSeconds As Integer,anyfield as variant) As Integer
Static resetAt As Date
Static i As Integer
'if reset date < now() set the flag and return 0
If DateDiff("s", resetAt, Now()) > 0 Then
resetAt = DateAdd("s", resetAfterSeconds, Now())
i = 0
incrementingCounterTimeFlaged = i
'if reset date > now increments and returns
Else
i = i + 1
incrementingCounterTimeFlaged = i
End If
End Function
autoincrement in SQL
SELECT (Select COUNT(*) FROM table A where A.id<=b.id),B.id,B.Section FROM table AS B ORDER BY B.ID Asc
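Here is a quick way to try the correlated-subquery idea without a full database server, using Python's built-in sqlite3 module. The table name t and the alias counter are assumptions made for this sketch; the rows are the ones from the question.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, section INTEGER)")
conn.executemany(
    "INSERT INTO t (id, section) VALUES (?, ?)",
    [(15, 7), (3, 2), (54, 1), (7, 4)],
)

# For each row b, count the rows a whose id is <= b.id.
numbered = conn.execute(
    "SELECT (SELECT COUNT(*) FROM t AS a WHERE a.id <= b.id) AS counter, "
    "b.id, b.section "
    "FROM t AS b ORDER BY b.id ASC"
).fetchall()

for row in numbered:
    print(row)
```

The counter follows the id order, so it only yields a gap-free 1..n sequence when id is unique, as it is here.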
You can use ROW_NUMBER() which is in SQL Server 2008
SELECT ROW_NUMBER() OVER (ORDER By ID DESC) RowNum,
ID,
Section
FROM myTable
RowNum then holds the sequence of row numbers.