I have a month column with values from 1 to 12. I am writing the query below to pad single-digit values to two digits, so that 1 and 2 become 01 and 02, but the concatenation is not working: the month still comes back as a single digit.
Main query:
select
case
when len(month) = 1
then concat(0, month)
else month
end as month_new,
month
from
Table
But when I run the concatenation on its own, as below, it works and converts single-digit months to two digits.
Query 1
select top 10 concat(0, month), month
from table
Query 1 alone is working
Query 2
select
case
when len(month) = 1
then 1
else 0
end,
month
from
Table
Query 2 also works on its own, which means the length check on the month column behaves as expected. But when concat is used inside the case expression, it does not work.
I modified the query as below and it worked for me:
select
case
when len(month) = 1
then concat(0, month)
else cast(month as varchar)
end as month_new,
month
from
table
The problem is that month is an integer, whereas the result of concat() is a string. A CASE expression returns a single data type, so the string from concat() is cast back into an integer, which drops the leading zero. You could force the integer into a string by using cast, but there are better ways to do this.
Instead, just use the FORMAT function:
select
format(month, '00') as month_new
, month
from viivscaazure.F_SALES_DETAIL
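The effect described above can be mimicked outside the database. The following Python sketch is only an analogy (it is not how SQL Server evaluates CASE, and the function names are made up for illustration): padding yields a string, and converting that string back to an integer loses the leading zero, which is roughly what happens when the integer ELSE branch forces the whole CASE to be an integer.

```python
def pad_month(month: int) -> str:
    """Zero-pad a month number to two digits, similar to FORMAT(month, '00')."""
    return f"{month:02d}"


def pad_then_cast_back(month: int) -> int:
    """Analogy for the failing CASE: the padded string is converted back to int."""
    return int(pad_month(month))


print(pad_month(3))           # a string that keeps the leading zero
print(pad_then_cast_back(3))  # an integer, so the leading zero is gone
```

Keeping every branch a string (as in the cast(month as varchar) version, or by using FORMAT) avoids the round trip through an integer.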
I don't know which database you are using, and since you didn't provide any sample data I can only assume that your CASE is not the problem. If the column's data type is a string, then your query is trying to CONCAT a string with an integer.
Maybe you can try quoting the zero so that it is a string ('0') and CAST the result as a string.
The two columns in the table look like this:
Year of birth | ID
2005          | -
1997          | -
85            | -
              | 95...
How do I write a SQL SELECT over all the data that returns the age of each person based on the year of birth alone? If the full year is not given, or only the ID is given, then:
- if only two digits of the year are given, such as 85, then by default the year of birth is 1985
- if no year is given, then use the ID, whose first two digits are the year of birth as above, i.e. for ID 95... the first two digits are 95, so the year of birth is 1995
MySQL
A simple example of using MySQL CASE function:
SELECT
CASE
WHEN year_of_birth REGEXP '^[0-9]{4}$' THEN year_of_birth
WHEN year_of_birth REGEXP '^[0-9]{2}$' THEN CONCAT("19", year_of_birth)
ELSE CONCAT("19", LEFT(ID, 2))
END as year_of_birth
FROM Accounts;
First, check for a 4-digit year_of_birth; if not found, check for a 2-digit one; otherwise fall back to the ID. The CONCAT function prepends "19" to the 2-digit year or to the first two digits of the ID, and REGEXP checks for 4- or 2-digit years.
Try it here: https://onecompiler.com/mysql/3y6yc7mv2
Firstly, I would suggest structuring your database in a cleaner way. Having some years formatted as four digits (e.g. 1985) and others as two is confusing and causes issues such as the one you have run into.
That being said, here is an ad-hoc SQL query (MySQL syntax) that will calculate the age based on the incomplete data.
SELECT YEAR(NOW()) -
CASE
WHEN `Year of Birth` IS NULL THEN 1900 + CAST(LEFT(`ID`, 2) AS UNSIGNED)
WHEN `Year of Birth` < 100 THEN 1900 + `Year of Birth`
ELSE `Year of Birth`
END AS age
FROM table_name;
This code is untested, and I assumed that the ID column is a string. You'll likely have to make adjustments to make it actually work for your database
To fix the structure of your table, however, a better approach might be cleaning the data and then calculating the date, using the following commands
Filling in null year values:
UPDATE table_name
SET `Year of Birth` = CAST(LEFT(`ID`, 2) AS UNSIGNED)
WHERE `Year of Birth` IS NULL;
Making all year values 4 digits long:
UPDATE table_name
SET `Year of Birth` = 1900 + `Year of Birth`
WHERE `Year of Birth` < 100;
Now, you can simply subtract the 'Year of Birth' column from the current year to calculate the age.
Good Luck!
Here is some relevant documentation
If-Else in SQL
Year Function in SQL
String Slicing in SQL
Casting Strings to Integers in SQL
You can follow these steps:
1. Filter out all null values (using the WHERE clause and the COALESCE function).
2. Transform each number to a valid year:
   - year of birth has length 2: map it to a value not greater than the current year (e.g. with 2022 as the current year, 22 -> 2022 and 23 -> 1923)
   - year of birth has length 4: keep it as is
3. Cast the year of birth string to a number.
4. Compute the difference between the current year and the retrieved year.
Here's the full query:
WITH cte AS (
SELECT COALESCE(yob, LEFT(ID, 2)) AS yob
FROM tab
WHERE NOT (yob IS NULL AND ID IS NULL)
)
SELECT yob,
YEAR(NOW()) -
CASE WHEN LENGTH(yob) = 2
THEN IF(CONCAT('20',yob) > YEAR(NOW()),
CONCAT('19',yob),
CONCAT('20',yob) )
WHEN LENGTH(yob) = 1
THEN CONCAT('200', yob)
ELSE yob
END +0 AS age
FROM cte
Check the demo here.
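The pivot logic behind the query can also be sketched in Python, which makes the edge cases easier to test. This is only a sketch under assumptions: the function names are hypothetical, both columns are treated as strings, and the current year is passed in explicitly so the result does not depend on when the code runs.

```python
from typing import Optional


def birth_year(yob: Optional[str], id_value: Optional[str], current_year: int) -> Optional[int]:
    """Resolve a full birth year from a 1-, 2- or 4-digit year, or from the ID prefix."""
    # Prefer the year column; otherwise fall back to the first two digits of the ID.
    digits = yob if yob else (id_value[:2] if id_value else None)
    if digits is None or not digits.isdigit():
        return None
    if len(digits) == 4:
        return int(digits)
    if len(digits) > 2:
        return None  # three-digit (or longer) years are ambiguous, so skip them
    # Try the 2000s first; if that lies in the future, use the 1900s instead.
    candidate = 2000 + int(digits)
    return candidate if candidate <= current_year else 1900 + int(digits)


def age(yob: Optional[str], id_value: Optional[str], current_year: int) -> Optional[int]:
    """Age in whole years, or None when no usable year is available."""
    year = birth_year(yob, id_value, current_year)
    return None if year is None else current_year - year


print(age("85", None, 2022))
print(age(None, "951234567890", 2022))
```

Unlike the answers that always prepend 19, this variant maps two-digit years into the 2000s whenever the result is not in the future, which is a design choice rather than something the question requires.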
Lots of opportunities to clean up what you started with, and lots of open questions too, but the code below should get you started.
drop table if exists #x
create table #x (YearOfBirth nvarchar(4), ID nvarchar(50))
insert into #x values
('2005', NULL),
('1997', NULL),
('85', NULL),
(NULL, '951234567890')
select
year(getdate()) -
case when len(isnull(YearOfBirth, '')) <> 4
then year(convert(date, '01/01/' +
case when YearOfBirth is NULL
then left(ID, 2)
else YearOfBirth end))
else YearOfBirth end
as PossibleAge
from #x
where (isnumeric(YearOfBirth) <> 0 and len(YearOfBirth) in (2, 4))
or (YearOfBirth is NULL and isnumeric(ID) <> 0)
One and three digit years will be ignored. Lots of ways to adjust this, but without knowing data types, etc. it's just meant to be a rough start.
I need some advice on splitting a number into a date/timestamp. I am currently using Hue to query the Hive DB.
In a table I have a column that is used to capture a unique ref for a record. The value looks like this:
219872021081000741
Contained within this is a date and time. I'm looking to extract the date/time from this (using SQL) and have it as a column of its own. Here is the breakdown of the number:
Reading left to right, the date parts are DD (21), YYYY (2021), MM (08) and HHMM (1000); the other groups (987 and 741) are not part of the date:
21 987 2021 08 1000 741
regex
[0-3]?[0-9]{1}$ref[2][0-9][0-9][0-9][0-1][0-9][0-2][0-9][0-5][0-9][0-9]{3}_"
Using SQL, I want to parse the number and create a column that formats it as a DD-MM-YY HHMM timestamp. I have reviewed some posts and tried out a few things, but am not having much luck. The other sticking point is that DD will not always be two digits, e.g. the 1st is stored as 1, not 01.
Trying to incorporate into the below. Thanks in advance for any advice.
select *,
cast((UTC +(60*60*12)*1000)/1000 as TIMESTAMP) as `LocalTime`
from Table.Name
where
name rlike 'FieldValue.*'
UPDATE: In a roundabout way I updated the SQL to count the digits in the value.
If it has 17 digits, then I know the day is anywhere from the 1st to the 9th,
so I tag it as 17.
If it has 18 digits, then I know the day is anywhere from the 10th to the end of the month.
From here I use substring to return the day components, which I'll bring into a single field via concat or something along those lines.
Here is the updated SQL. I just need some guidance on how to use the new column FieldCount: if it is 17, then substring(FieldValue, 1, 1), since the day is anything from the 1st to the 9th; if it is 18, then substring(FieldValue, 1, 2), since the day is anything from the 10th up.
select *,
cast((utc+(60*60*12)*1000)/1000 as TIMESTAMP) as `LocalTime`,
case
when FieldValue REGEXP '^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$' then '17'
when FieldValue REGEXP '^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$' then '17'
end FieldCount,
substring(FieldValue ,6,4) as Years,
substring(FieldValue ,1,1) as Days,
substring(FieldValue ,10,2) as Months,
substring(FieldValue ,12,2) as Hours,
substring(FieldValue ,14,2) as Minut
from table.name
New update: I changed this to split the value with case conditions, which separates it out into individual fields. Any ideas on how to concat based on the alias field names?
select
AField,
cast((UTC+(60*60*12)*1000)/1000 as TIMESTAMP) as `LocalTime`,
case when length(AField) = 18 then substring(AField,1,2) else substring(AField,1,1) end Days,
case when length(AField) = 18 then substring(AField,10,2) else substring(AField,9,2) end Months,
case when length(AField) = 18 then substring(AField,6,4) else substring(AField,5,4) end years,
case when length(AField) = 18 then substring(AField,12,2) else substring(AField,11,2) end Hours,
case when length(AField) = 18 then substring(AField,14,2) else substring(AField,13,2) end minutes
from table.name
The correct timestamp string representation in Hive is yyyy-MM-dd HH:mm:ss.S.
You do not need to extract all the parts separately and then concat them to get a timestamp. With regexp_replace you can build the correct timestamp string using backreferences to the capturing groups (the parts in round brackets) in the regexp.
with mytable as(--test dataset, use your table instead
select stack(2,
'219872021081000741',
'19872021081000741'
) as AField
)
select
case when length(AField) = 18
then timestamp(regexp_replace(AField,'^(\\d{2})\\d{3}(\\d{4})(\\d{2})(\\d{2})(\\d{2})\\d{3}$','$2-$3-$1 $4:$5:00.0'))
else timestamp(regexp_replace(AField,'^(\\d)\\d{3}(\\d{4})(\\d{2})(\\d{2})(\\d{2})\\d{3}$','$2-$3-0$1 $4:$5:00.0'))
end as result
from mytable
Result:
result
2021-08-21 10:00:00.0
2021-08-01 10:00:00.0
Note: timestamp() construct here is to demonstrate that string produced is compatible with timestamp data type and is being cast correctly, you can keep it as string if you prefer.
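The same capture-group idea can be checked quickly outside Hive. The Python sketch below is an illustration only (the function and pattern names are made up): it assumes the layout described in the question, i.e. a one- or two-digit day, three filler digits, YYYY, MM, HHMM and three trailing digits, and uses a single pattern with \d{1,2} for the day instead of branching on the length.

```python
import re
from datetime import datetime

# day (1-2 digits), 3 filler digits, year, month, hour, minute, 3 trailing digits
REF_PATTERN = re.compile(r"^(\d{1,2})\d{3}(\d{4})(\d{2})(\d{2})(\d{2})\d{3}$")


def ref_to_timestamp(ref: str) -> datetime:
    """Extract the embedded date and time from a 17- or 18-digit reference."""
    match = REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"unexpected reference format: {ref!r}")
    day, year, month, hour, minute = (int(part) for part in match.groups())
    return datetime(year, month, day, hour, minute)


print(ref_to_timestamp("219872021081000741"))
print(ref_to_timestamp("19872021081000741"))
```

Because the pattern is anchored at both ends, an 18-digit value can only match with a two-digit day and a 17-digit value only with a one-digit day, so the two cases do not get confused.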
I have a due date column in the format 20210701 (YYYYMMDD). Using SQL, I want to extract all the dates except the 5th of each month (as highlighted in the picture below).
I used the below code:
SELECT Due_Date_Key
FROM Table
WHERE Due_Date_Key <>20210705
However, the problem with the above code is that it excludes the 5th only for July, not for the other months.
How can I extract all dates except the 5th from the entire column?
Help would be much appreciated.
Note that column DUE_DATE_KEY is numeric.
A more SQLish way would be to convert string to date and then check if day is not 5
SELECT * FROM Table
WHERE DATE_PART('day', to_date(cast(DUE_DATE_KEY as varchar), 'YYYYMMDD')) != 5
Use the modulo operator to check whether the last two digits of DUE_DATE_KEY are 05.
select * from T where DUE_DATE_KEY % 100 <> 5
Using your sample data, the above query returns the following:
due_date_key
20210701
20210708
20210903
Refer to this db fiddle
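The arithmetic behind the modulo filter is easy to verify by hand: a YYYYMMDD key modulo 100 leaves just the day part. A small Python sketch, where the list of keys is illustrative (the two keys for the 5th are invented for the example, not taken from the question):

```python
def is_fifth_of_month(due_date_key: int) -> bool:
    """True when the last two digits (the DD part) of a YYYYMMDD key are 05."""
    return due_date_key % 100 == 5


# Illustrative keys; 20210705 and 20210805 are made-up rows for the 5th.
keys = [20210701, 20210705, 20210708, 20210805, 20210903]
kept = [key for key in keys if not is_fifth_of_month(key)]
print(kept)
```

This only works because the column is numeric and always has the day in the last two digits; for a string column, the conversion-to-date approach is the safer choice.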
I am new to BQ and struggling with the following:
I want to find the number of days between two timestamps
entered in different formats.
For example, Date_Col1 is 2015/04/13 12:40:44.000 and
Date_Col2 is entered as 4/30/2015 17:35
I tried changing the format using date(timestamp(4/30/2015 17:35)), but I get null every time. BQ doesn't let me change the format of Date_Col2, although it works well with Date_Col1. Another issue is that Date_Col2 is entered with both single- and double-digit month values, so I can't use concat or substring either. Also, Date_Col2 is sometimes null.
I guess nulls can be replaced with 0.
I was wondering if anyone has worked on this use case.
Below is an example of calculating business days between two dates in different formats. It works for other dates but not for Vitals_date (Date_Col2, which has the different format).
(DATEDIFF(TIMESTAMP(hp.ARRIVAL_TIME_PAC_TZ), TIMESTAMP(Vitals_date)) + 1)
-(INTEGER((DATEDIFF(TIMESTAMP(hp.ARRIVAL_TIME_PAC_TZ), (TIMESTAMP(Vitals_date))) + 1) / 7) * 2)
-(CASE WHEN DAYOFWEEK(TIMESTAMP(Vitals_date)) = 1 THEN 1 ELSE 0 END)
-(CASE WHEN DAYOFWEEK(TIMESTAMP(hp.ARRIVAL_TIME_PAC_TZ)) = 7 THEN 1 ELSE 0 END) as AGING_GUTS_ARRIVAL_Vitals_date,
To convert col2 into timestamp, you could use the following:
timestamp(concat(
regexp_extract(col2, r"\d+/\d+/(\d+)"), "/",
regexp_extract(col2, r"(\d+)/\d+/\d+"), "/",
regexp_extract(col2, r"\d+/(\d+)/\d+"),
regexp_extract(col2, r"\d+/\d+/\d+(.*)"), ":00"))
It should work with 1 or 2 digits for month
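To sanity-check the expected day difference for the two sample values, the parsing can be sketched in Python. The format strings and the function name are assumptions made for this example, and a null in either column is passed through as None rather than replaced with 0.

```python
from datetime import datetime
from typing import Optional

COL1_FORMAT = "%Y/%m/%d %H:%M:%S.%f"  # e.g. 2015/04/13 12:40:44.000
COL2_FORMAT = "%m/%d/%Y %H:%M"        # e.g. 4/30/2015 17:35 (month may be 1 or 2 digits)


def days_between(col1: Optional[str], col2: Optional[str]) -> Optional[int]:
    """Whole calendar days from col1 to col2, or None if either value is missing."""
    if not col1 or not col2:
        return None
    start = datetime.strptime(col1, COL1_FORMAT)
    end = datetime.strptime(col2, COL2_FORMAT)
    return (end.date() - start.date()).days


print(days_between("2015/04/13 12:40:44.000", "4/30/2015 17:35"))
```

Returning None for missing values keeps a missing date distinguishable from a real difference of zero days.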
I am working on a table where the age of a person is in a string field where it is in the following format: (amount UnitOfMeasurement)
1 year old = 1 y
11 months old = 11 m
5 Days old = 5 d
I am trying to search within a range of ages. Is it possible to do this via a SQL query that would order the days (d) first, then months (m), and then years (y)?
The database is on SQL Server 2008, but the query will probably be done on Access as it is used for a report's record source.
The first thing I'd do in your situation is try to clean up the messy age field, and standardise it. A quick start might be to create a query where you separate the age value and the age unit, by using expressions such as:
age_unit: Right([age], 1)
and
age_value: Val([age])
If you then sort by age_unit and age_value, you will get all ages sorted correctly (under the assumption that an age in days is always less than an age in months, which in turn is always less than an age in years). Note that you must sort by unit first, then value.
If you want to return ages between a certain minimum and maximum, it's not a problem if you're sticking to a single unit, such as all ages between 5 years and 15 years. Just enter "y" as a criteria under the "age_unit" field (assuming you're using the visual query builder here) and enter "Between 5 and 15" under the "age_value" field.
If you're mixing units ("all ages between 6 months and 2 years") it gets a little more complicated. In this case you'd need to do the following:
On one criteria row you'd enter the following values for each field:
age_unit: "m"
age_value: >=6
And then on the next criteria row:
age_unit: "y"
age_value: <=2
This will return all ages having unit "m" and a value >= 6 OR having unit "y" with a value <=2.
Another somewhat simpler solution would be to convert all ages to a standard unit such as years, by doing some simple calculations, e.g. divide "d" unit values by 365.25, and divide "m" unit values by 12. Then create a new field in your table for the new standardised age data.
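The conversion to a standard unit can be sketched as follows. This Python example is only an illustration of the arithmetic described above (the function name is made up, and it assumes the short "amount unit" form such as "11 m"):

```python
DAYS_PER_YEAR = 365.25
MONTHS_PER_YEAR = 12


def age_in_years(age: str) -> float:
    """Convert an age such as '5 d', '11 m' or '1 y' to fractional years."""
    value_text, unit = age.split()
    value = int(value_text)
    if unit == "y":
        return float(value)
    if unit == "m":
        return value / MONTHS_PER_YEAR
    if unit == "d":
        return value / DAYS_PER_YEAR
    raise ValueError(f"unknown age unit: {unit!r}")


ages = ["1 y", "11 m", "5 d", "2 y", "6 m"]
print(sorted(ages, key=age_in_years))

# Range search across units: everything between 6 months and 2 years inclusive.
in_range = [a for a in ages if age_in_years("6 m") <= age_in_years(a) <= age_in_years("2 y")]
print(in_range)
```

With a single numeric value per row, a mixed-unit range such as "6 months to 2 years" becomes one ordinary BETWEEN-style comparison instead of separate criteria per unit.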
Your best bet would be to create a new column with a real DATETIME value in it. You could then write code, such as a CASE statement, to help convert the string into a DATETIME. Once completed, your calculations will become much simpler.
1. This field doesn't have atomic values. This means that your table is not in 1NF.
You should split the Age field into 2 columns with atomic values: IntervalType (CHAR(1) ... CHECK (IntervalType IN ('d','m','y'))) and IntervalValue (INT; 1, 2, etc.).
So, instead of Table(...,Age) you can use Table(...,IntervalType,IntervalValue) and
SELECT *
,CONVERT(VARCHAR(10),IntervalValue)
+' '+CASE IntervalType WHEN 'd' THEN 'day' WHEN 'm' THEN 'month' WHEN 'y' THEN 'year' END
+CASE WHEN IntervalValue > 1 THEN 's' ELSE '' END
+' old = '
+CONVERT(VARCHAR(10),IntervalValue)
+' '+IntervalType
FROM table
2. How do you sort these two values: 30 d and 1 m? One month can have from 28 to 31 days.
3. SQL Server solution:
DECLARE @TestData TABLE
(
Age VARCHAR(25) NOT NULL
,IntervalValue AS CONVERT(INT,LEFT(Age,CHARINDEX(' ',Age))) PERSISTED
,IntervalType AS RIGHT(Age,1) PERSISTED
);
INSERT @TestData
VALUES
('1 year old = 1 y')
,('2 years old = 2 y')
,('11 months old = 11 m')
,('30 Days old = 30 d')
,('5 Days old = 5 d');
SELECT *
FROM @TestData a
ORDER BY a.IntervalType, a.IntervalValue;