I need a computed column formula that gives me this format: yyMMdd##.
I have an identity column (DataID) and a date column (DataDate).
This is what I have so far:
(((right(CONVERT([varchar](4),datepart(year,[DataDate]),0),(2))+
right(CONVERT([varchar](4),datepart(month,[DataDate]),0),(2)))+
right(CONVERT([varchar](4),datepart(day,[DataDate]),0),(2)))+
right('00'+CONVERT([varchar](2),[DataID],0),(2)))
And this gives me:
12111201
12111202
12111303
12111304
12111405
12111406
12111407
12111508
What I want is:
12111201
12111202
12111301
12111302
12111401
12111402
12111403
12111501
I'm assuming you want a sequence starting at 1 for each date - right? If not: please explain what you really want / need.
You won't be able to do this with an IDENTITY column and a computed column specification. An IDENTITY column returns continually increasing numbers.
What you could do is not store those values on disk, but instead use a CTE and the ROW_NUMBER() OVER (PARTITION BY ...) construct to create those numbers on the fly, whenever you need to select them. Or have a job that sets those values based on such a CTE on a regular basis (e.g. once every hour or so).
That CTE might look something like this - again, assuming that DataDate is indeed of type DATE (and not DATETIME or something like that):
;WITH CTE AS
(
    SELECT
        DataID, DataDate,
        RowNum = ROW_NUMBER() OVER (PARTITION BY DataDate ORDER BY DataID)
    FROM dbo.YourTable
)
SELECT DataID, DataDate, RowNum
FROM CTE
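If you do want to persist the per-date sequence (e.g. from that hourly job), the same CTE can drive an UPDATE directly. A minimal sketch, assuming you add an integer column SeqNo to the table (the column name is a placeholder):

```sql
;WITH CTE AS
(
    SELECT
        SeqNo,
        RowNum = ROW_NUMBER() OVER (PARTITION BY DataDate ORDER BY DataID)
    FROM dbo.YourTable
)
-- updatable CTE: writes each row's per-date sequence number back to the table
UPDATE CTE
SET SeqNo = RowNum;
```

The full yyMMdd## string can then be built at select time, e.g. CONVERT(VARCHAR(6), DataDate, 12) + RIGHT('00' + CONVERT(VARCHAR(2), SeqNo), 2), since style 12 formats a date as yymmdd.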
Is there any easy way to get information about the last update date of a selected column using a system-versioned temporal table?
I have a table with columns A, B, C; each of them is updated randomly and separately, but I am interested in whether it is possible to easily extract the date of the last update of column B.
I added a photo for the sake of simplicity; I need to extract the information on when the last change of the value in column A occurred (in the photo I marked the last change in this column).
Use the LAG() window function to look for changes in B, summarize that set to find MAX(StartTime), and use that in a WHERE filter to select your latest record:
SELECT * FROM history.[table]
WHERE StartTime = (
    SELECT MAX(StartTime)
    FROM (
        SELECT *,
               CASE WHEN B <> LAG(B) OVER (ORDER BY StartTime) THEN 1 ELSE 0 END AS B_Changed
        FROM history.[table]
    ) t
    WHERE t.B_Changed = 1
)
I was able to find a simple solution that handles each case; the solution is below:
SELECT TOP (1) *
FROM (
    SELECT
        ID,
        A,
        LAG(A) OVER (PARTITION BY ID ORDER BY StartTime) AS PreviousA,
        UserID,
        StartTime
    FROM dbo.[Table] FOR SYSTEM_TIME ALL
) t
WHERE t.A <> t.PreviousA
ORDER BY t.StartTime DESC
The query returns the last modification of the column; if there was no modification in the table, or only another column was modified, it correctly returns an empty result set, indicating that there were no changes. Maybe someone will need it in the future. Thank you for your help.
In BigQuery, I am trying to copy column values into other rows using a PARTITION BY statement anchored to a particular ID number.
Here is an example:
Right now, I am trying to use:
MIN(col_a) OVER (PARTITION BY CAST(id AS STRING), date ORDER BY date) AS col_b
It doesn't seem like the aggregate function is working properly: col_b still has NULL values when I try this method. Am I misunderstanding how aggregate functions work?
You can use this:
MIN(col_a) OVER (PARTITION BY id) AS col_b
If you have one value per id, this will return that value.
Note that converting id to a string is unnecessary. Also, you don't need a cumulative minimum, so there is no need for an ORDER BY.
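To illustrate with dummy data (the table and values below are made up): aggregate functions such as MIN() ignore NULLs, so every row of an id gets that id's single non-NULL value:

```sql
#standardSQL
WITH my_table AS (
  SELECT 1 AS id, CAST(NULL AS INT64) AS col_a UNION ALL
  SELECT 1, 10 UNION ALL
  SELECT 2, NULL UNION ALL
  SELECT 2, 20
)
SELECT id, col_a,
  -- MIN skips NULL rows, so both rows of each id see the non-NULL value
  MIN(col_a) OVER (PARTITION BY id) AS col_b
FROM my_table
```

Here col_b is 10 for both id = 1 rows and 20 for both id = 2 rows.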
Another option is using COALESCE with a correlated subquery:
select *,
       coalesce(col_a, (select min(col_a) from my_table b where a.id = b.id)) as col_b
from my_table a;
I want to generate unique ids while inserting into a BigQuery table. ROW_NUMBER() OVER() fails with "resources exceeded". Forums recommend using ROW_NUMBER() OVER(PARTITION BY). Unfortunately, PARTITION BY alone can't be used, as it may produce the same row numbers for different partition-by keys. Please note that the data I am trying to insert is at least a few hundred million rows every day.
Unfortunately, partition by can't be used as it may produce same row_numbers for the partition by key
Yes - you will get the same numbers in different partitions - so you can just use a compound key, like in the much simplified example below. It is just to show the approach; you should be able to tweak it to your specific case.
#standardSQL
WITH `project.dataset.table` AS (
  SELECT value, CAST(10 * RAND() AS INT64) partitionid
  FROM UNNEST(GENERATE_ARRAY(1, 100)) value
)
SELECT
  partitionid,
  value,
  CONCAT(
    CAST(1000 + partitionid AS STRING),
    CAST(10000 + ROW_NUMBER() OVER(PARTITION BY partitionid ORDER BY value) AS STRING)
  ) id
FROM `project.dataset.table`
-- ORDER BY id
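If you would rather have a numeric id than a concatenated string, the same compound-key idea works with plain arithmetic. A sketch, where the multiplier (1,000,000,000 here, an assumption) just has to exceed the largest possible per-partition row count:

```sql
#standardSQL
SELECT
  partitionid,
  value,
  -- partitionid occupies the high digits, the per-partition row number the low ones,
  -- so the combination is unique as long as row counts stay below the multiplier
  partitionid * 1000000000
    + ROW_NUMBER() OVER(PARTITION BY partitionid ORDER BY value) AS id
FROM `project.dataset.table`
```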
This is a very simple question, but I can't seem to find documentation on it. How would one query rows by index (i.e. select the 10th through 20th rows in a table)?
I know there's a row_number function, but it doesn't seem to do what I want.
Do not specify any partition, so your row number will be an integer between 1 and your number of records. (Note that without an ORDER BY, the order in which row numbers are assigned is arbitrary.)
SELECT row_num FROM (
SELECT row_number() over () as row_num
FROM your_table
)
WHERE row_num between 100000 and 100010
I seem to have found a roundabout and clunky way of doing this in Athena, so better answers are welcome. This approach requires that you already have some numeric column in your table, in this case named some_numeric_column:
SELECT some_numeric_column, row_num FROM (
SELECT some_numeric_column,
row_number() over (order by some_numeric_column) as row_num
FROM your_table
)
WHERE row_num between 100000 and 100010
To explain: you first select some numeric column in your data, then create a column (called row_num) of row numbers based on the order of your selected numeric column. Then you wrap it all in an outer SELECT, because Athena doesn't support creating and then filtering on the row_num column within a single query. If you don't wrap it in a second SELECT, Athena will spit out errors about not finding a column named row_num.
I want to insert sequential numbers into a single column.
The column has been set to NULL.
Is there a command I can now use to populate just that column with sequential numbers?
Thanks
You can use the ROW_NUMBER() analytic function to generate sequence numbers, which can then be used to update the values:
WITH cte AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn, column1
    FROM table1
)
UPDATE cte
SET column1 = rn
Based on your comment:
that almost did it but it would be nice if we can get it to sequence
properly for instance like 0000001, 0000002,0000003 etc up to 7
characters
I think this is what you are trying to do:
SELECT RIGHT('0000000' + CONVERT(VARCHAR(10), ROW_NUMBER() OVER (ORDER BY COLUMN_NAME)), 7) AS ROW_NUM
FROM TABLE_NAME
Update after comment:
Basically this is what I am trying to do. I will be running a query to
grab those 7 character ID from that ALTATH column that will be used as
Reference ID so the column should have those output or type in
it..hope that's clear...thanks
I would personally use an Identity column and format the output when selecting records like this:
SELECT RIGHT('0000000' + CONVERT(VARCHAR(10), ID_COLUMN), 7) AS ROW_NUM
FROM TABLE_NAME
If you use the analytic function, the ROW_NUM values will not be the same every time. If you use IDENTITY, SQL Server will take care of assigning the numbers in sequence for you. That is the DBMS's job.
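A minimal sketch of that setup (table and column names are placeholders; exposing the formatted value as a computed column is just one option, you could equally keep the SELECT above):

```sql
CREATE TABLE TABLE_NAME
(
    -- IDENTITY assigns 1, 2, 3, ... automatically on insert
    ID_COLUMN INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    -- formatted 7-character reference id, derived from the identity value
    ROW_NUM AS RIGHT('0000000' + CONVERT(VARCHAR(10), ID_COLUMN), 7)
);
```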