Given a table that has a column of string "timestamps" (yyyyMMddHHmmssSSS format), I want to substring the first 8 characters, and get a count of how many rows have that substring, grouping the results.
Sample data...
TIMESTAMP
20100802123456123
20100803123456123
20100803123456123
20100803123456123
20100804123456123
20100805123456123
20100805123456123
20100805123456123
20100805123456123
20100806123456123
20100807123456123
20100807123456123
...and expected results...
SUBSTRING, COUNT
20100802, 1
20100803, 3
20100804, 1
20100805, 4
20100806, 1
20100807, 2
I know this should be easy, but I'm not having any luck at the moment.
I don't have a database to test with, but it seems like you are looking for
select
substr(timestamp, 1, 8),
count(*)
from
my_table
group by
substr(timestamp, 1, 8);
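As a quick sanity check of the query above, here is a minimal Python sketch that runs it against an in-memory SQLite database loaded with the sample data. SQLite only stands in for the unspecified database (an assumption), and the table name `my_table`, the column name `ts`, and the helper `count_by_day` are placeholders. An `order by` is added so the row order is predictable.

```python
import sqlite3

# Sample data copied from the question.
SAMPLE = [
    "20100802123456123",
    "20100803123456123", "20100803123456123", "20100803123456123",
    "20100804123456123",
    "20100805123456123", "20100805123456123",
    "20100805123456123", "20100805123456123",
    "20100806123456123",
    "20100807123456123", "20100807123456123",
]


def count_by_day(timestamps):
    """Return (first 8 characters, row count) pairs, one per distinct prefix."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table my_table (ts text)")
    conn.executemany("insert into my_table (ts) values (?)",
                     [(t,) for t in timestamps])
    rows = conn.execute(
        """
        select substr(ts, 1, 8) as day, count(*) as cnt
        from my_table
        group by substr(ts, 1, 8)
        order by day
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for day, cnt in count_by_day(SAMPLE):
        print(f"{day}, {cnt}")
```

The grouping expression is repeated in `group by` rather than referenced by alias, as in the answer, because not every database accepts a select alias there.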
Related
I am trying to find the maximum number of consecutive digits that appear in a string column. Let me give an example to better illustrate what I am trying to do. Suppose I have a table called email:
email
lucas1234#gmail.com
fer12#gmail.com
lupal#gmail.com
carlos1perez222#gmail.com
carlos11perez222#gmail.com
lucila1#gmail.com
my expected output would be
email count_cons_digits
lucas1234#gmail.com 4
fer12#gmail.com 2
lupal#gmail.com 0
carlos1perez222#gmail.com 3
carlos11perez222#gmail.com 3
lucila1#gmail.com 1
Note that this question is very similar to:
Number of consecutive digits in a column string
The only difference is that the function from that answer does not handle emails with only one digit (like lucila1#gmail.com): the expected result is 1, but the proposed function gives 0. It also fails when the email contains two separate runs of consecutive digits (carlos11perez222#gmail.com): the expected output is 3, but it gives 5.
Consider the approach below:
select *,
ifnull((select length(digits) len
from unnest(regexp_extract_all(email, r'\d+')) digits
order by len desc
limit 1
), 0) as count_cons_digits
from your_table
If applied to the sample data in your question, the output matches your expected result.
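As a rough cross-check of that logic outside the database, the sketch below does the same thing in Python: extract every run of digits, take the length of the longest one, and fall back to 0 when there are none. The function name `max_consecutive_digits` is made up for illustration, and Python's `re` module is only assumed to behave like the SQL regex engine for this simple `\d+` pattern.

```python
import re

# Sample emails copied from the question.
EMAILS = [
    "lucas1234#gmail.com",
    "fer12#gmail.com",
    "lupal#gmail.com",
    "carlos1perez222#gmail.com",
    "carlos11perez222#gmail.com",
    "lucila1#gmail.com",
]


def max_consecutive_digits(text):
    """Length of the longest run of consecutive digits in text, or 0 if none."""
    runs = re.findall(r"\d+", text)  # every maximal run of digits
    return max((len(run) for run in runs), default=0)


if __name__ == "__main__":
    for email in EMAILS:
        print(email, max_consecutive_digits(email))
```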
You may also try this approach using regex:
WITH email AS
(SELECT 'lucas1234#gmail.com' mail,
UNION ALL SELECT 'fer12#gmail.com',
UNION ALL SELECT 'lupal#gmail.com',
UNION ALL SELECT 'carlos1perez222#gmail.com',
UNION ALL SELECT 'carlos11perez222#gmail.com',
UNION ALL SELECT 'lucila1#gmail.com')
SELECT email,
(LENGTH(REGEXP_REPLACE(REGEXP_REPLACE(email.mail, r'[A-Za-z]+\d+[A-Za-z]+', ''),r'[A-Za-z.#]+',''))) AS count_cons_digits,
FROM email;
Output:
Is there any way to split values based on trailing consecutive 0s in Presto? The first part should contain at least 6 digits: if the digit count is less than 6, some of the 0s need to be counted as digits before splitting; if the digit count is >= 6, the value just needs to be split into 2 groups.
The query below works as expected in Hive, but I am not able to do the same in Presto.
select low as orginal_Value,
split(regexp_replace(low,'(\\d{6,}?)(0+)$','$1|$2'),'\\|') Output_Value from test;
Presto Query:
presto> SELECT regexp_split('1234567890000', '(\d{6,}?)(0+)$') as output;
output
[1234567890000]
(1 row)
It works now:
select split(regexp_replace('1234567890000','(\d{6,}?)(0+)$','$1|$2'), '|') as output;
output
-------------------
[123456789, 0000]
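To see why the replace-then-split form behaves this way, here is a small Python sketch of the same pattern. The lazy `\d{6,}?` takes as few digits as possible (but at least 6), and `(0+)$` then claims the remaining trailing zeros. The helper name `split_trailing_zeros` is hypothetical, and Python's `re` is only assumed to match the Presto and Hive regex behavior for this pattern.

```python
import re

# Same pattern as in the queries: a lazy group of at least 6 digits,
# followed by the zeros that run to the end of the string.
PATTERN = re.compile(r"(\d{6,}?)(0+)$")


def split_trailing_zeros(value):
    """Insert '|' between the two groups, then split on it."""
    return PATTERN.sub(r"\1|\2", value).split("|")


if __name__ == "__main__":
    for value in ["1234567890000", "1230000", "12300"]:
        print(value, split_trailing_zeros(value))
```

With `1230000` the first group has to borrow zeros to reach 6 characters, and with `12300` the pattern does not match at all, so the value comes back unsplit.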
I have 2 rows of data that I want to turn into 2 columns.
I tried the union syntax, but it didn't work.
Here is the data I have:
breed 1 breed2
I tried to convert it with this sql
select a.breed union a.breed
but it didn't work.
Here is what you want from the SQL:
breed1,breed2
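SELECT
[breed1],
[breed2]
FROM
(
-- mySecondColumn is a placeholder value column (an assumption); PIVOT needs something to aggregate
SELECT 'breed1' AS myColumn, 1 AS mySecondColumn
UNION
SELECT 'breed2', 2
) AS SourceTable
PIVOT
(
AVG(mySecondColumn) FOR
myColumn IN ([breed1], [breed2])
) AS PivotTable;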
You can use a self join. This needs a way to pair rows together (so if you have four rows you get 1 and 2 in one result and 3 and 4 in the other rather than another combination).
I'm going to assume you have sequentially numbered rows in an Id column and an odd numbered row is paired with the one greater even Id:
select odd.Data as 'First', even.Data as 'Second'
from TheData odd
inner join TheData even on odd.Id+1 = even.Id
where odd.Id % 2 = 1;
More generally, when there are more columns, a pivot is more flexible.
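The self join above can be tried out with the following Python sketch, which runs it on an in-memory SQLite database. SQLite is only a stand-in for the actual database (an assumption), the rows `breed3` and `breed4` are invented to show a second pair, and the column aliases are renamed to `first_col` and `second_col` to avoid quoting.

```python
import sqlite3


def pair_rows(rows):
    """Pair each odd Id with the next even Id, one output row per pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table TheData (Id integer primary key, Data text)")
    conn.executemany("insert into TheData (Id, Data) values (?, ?)", rows)
    pairs = conn.execute(
        """
        select odd.Data as first_col, even.Data as second_col
        from TheData odd
        inner join TheData even on odd.Id + 1 = even.Id
        where odd.Id % 2 = 1
        order by odd.Id
        """
    ).fetchall()
    conn.close()
    return pairs


if __name__ == "__main__":
    sample = [(1, "breed1"), (2, "breed2"), (3, "breed3"), (4, "breed4")]
    print(pair_rows(sample))
```

Because it is an inner join, a trailing odd row without a partner simply drops out of the result.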
How about an aggregation query?
select min(breed) as breed1, max(breed) as breed2
from t;
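A minimal way to try the aggregation query is sketched below in Python with an in-memory SQLite database (a stand-in, not necessarily the database in use). The helper name `two_rows_to_columns` is made up. Note the assumption baked into the approach: with exactly two rows, `min` and `max` return one value each, but with more rows only the smallest and largest values survive.

```python
import sqlite3


def two_rows_to_columns(breeds):
    """Collapse the rows of table t into one row using min() and max()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (breed text)")
    conn.executemany("insert into t (breed) values (?)",
                     [(b,) for b in breeds])
    row = conn.execute(
        "select min(breed) as breed1, max(breed) as breed2 from t"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(two_rows_to_columns(["breed1", "breed2"]))
```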
I am using Oracle database.
In my table t_mytable, I have one field myfield and this field has string values like 00101110010.
I need to count the rows where the 4th digit of the myfield value is 1.
For instance,
myfield
-------
00101110010
00111110010
00101101010
00101110010
00111111110
For the above data, the count should be 2 because 2 rows have 1 as the fourth bit (I count from 1, not 0, when determining the first digit).
How can I do this in sql?
If myfield is a string, you can use substr to extract the fourth character:
select count(*)
from t_mytable
where substr(myfield, 4,1) ='1';
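The query can be checked against the sample data with the Python sketch below. It uses an in-memory SQLite database instead of Oracle (an assumption made so the example is runnable anywhere); `substr` is 1-based in both, so the 4th character is position 4. The helper name `count_fourth_digit_one` is hypothetical.

```python
import sqlite3

# Sample data copied from the question.
FIELDS = [
    "00101110010",
    "00111110010",
    "00101101010",
    "00101110010",
    "00111111110",
]


def count_fourth_digit_one(values):
    """Count rows whose 4th character (1-based) is '1'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t_mytable (myfield text)")
    conn.executemany("insert into t_mytable (myfield) values (?)",
                     [(v,) for v in values])
    (count,) = conn.execute(
        "select count(*) from t_mytable where substr(myfield, 4, 1) = '1'"
    ).fetchone()
    conn.close()
    return count


if __name__ == "__main__":
    print(count_fourth_digit_one(FIELDS))
```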
I have a table T1 with the data below:
Sno Ns_NAME Mode stat
1 AF_rtf_Nd_1 Manual 2
2 AF_rtf_Nd_2 Manual 3
3 AF_rtf_Nd_2i Manual 2
4 AF_rtf_Nd_3 Auto 2
5 AF_rtf_Nd_3i Auto 3
I need to do the following:
Check if the mode is Manual, take the part of Ns_NAME up to the last "_", and check for duplicates. In this case there is 1 duplicate. Obtain the average stat [(2+3)/2] of those two rows and insert it into another table, T2.
Output:
T2
AF_rtf_Nd Manual 2.5
I tried using the substr function and extract, but it is not fetching the correct result.
In order to look at 'Manual' records only, use a WHERE clause. Then aggregate over the substring. You can get the substring with INSTR plus SUBSTR or with REGEXP_REPLACE. Then keep only the duplicates by using HAVING COUNT(*) > 1.
insert into t2
select
min(sno),
regexp_replace(ns_name, '_[^_]*$', ''),
'Manual',
avg(stat)
from t1
where mode = 'Manual'
group by regexp_replace(ns_name, '_[^_]*$', '')
having count(*) > 1;
The equivalent to the REGEXP_REPLACE with INSTR and SUBSTR is
substr(ns_name, 1, instr(ns_name, '_', -1) - 1)
(The only difference is when there is no '_' in the string at all.)
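The difference mentioned above can be illustrated with a Python sketch that imitates both expressions. This is a model, not Oracle itself: it assumes that INSTR returns 0 when the character is not found and that SUBSTR returns NULL (represented here as `None`) when the requested length is below 1. The function names are invented for the illustration.

```python
import re


def strip_with_regex(ns_name):
    """Imitate regexp_replace(ns_name, '_[^_]*$', ''):
    drop the last '_' and everything after it; no '_' means no change."""
    return re.sub(r"_[^_]*$", "", ns_name)


def strip_with_instr(ns_name):
    """Imitate substr(ns_name, 1, instr(ns_name, '_', -1) - 1)
    under the assumed Oracle semantics described in the lead-in."""
    pos = ns_name.rfind("_") + 1  # 1-based position of the last '_', 0 if absent
    length = pos - 1
    if length < 1:
        return None  # modelled NULL: SUBSTR with a length below 1
    return ns_name[:length]


if __name__ == "__main__":
    for name in ["AF_rtf_Nd_1", "AF_rtf_Nd_2i", "plain"]:
        print(name, strip_with_regex(name), strip_with_instr(name))
```

For names that contain an underscore both helpers agree; for a name without one, the regex version returns the name unchanged while the INSTR/SUBSTR model yields `None`.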