How to find increasing volumes in SQL Server

Below is the sample data
create table #sample (id int,Spenddate datetime, Balance int)
insert into #sample values (1,getdate(),100)
insert into #sample values (1,getdate()+1,98)
insert into #sample values (1,getdate()+2,50)
insert into #sample values (1,getdate()+3,0)
insert into #sample values (1,getdate()+5,20)
insert into #sample values (1,getdate()+6,25)
insert into #sample values (1,getdate()+7,30)
insert into #sample values (1,getdate()+8,40)
insert into #sample values (1,getdate()+9,55)
I need to find the continuous increases.
Actually, the real situation is: instead of debiting, the system somehow mistakenly credits amounts to the customer after the balance has gone to zero, so I need to find the affected records.
Thanks

You can do this using the LAG and LEAD functions in SQL Server 2012:
http://www.mssqltips.com/sqlservertip/2877/sql-server-analysis-serviceslead-lag-openingperiod-closingperiod-time-related-functions/
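A minimal sketch of that 2012+ approach against the #sample table above (my code, not the answerer's):

```sql
-- LAG() fetches the previous balance per id in Spenddate order; rows where
-- the balance rose above the previous one are the suspect credits.
SELECT *
FROM (SELECT s.*,
             LAG(s.Balance) OVER (PARTITION BY s.id ORDER BY s.Spenddate) AS prev_balance
      FROM #sample s) x
WHERE x.Balance > x.prev_balance;
```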

What you need is the lag() function, but it isn't available in SQL Server 2008. You can get the same effect with a correlated subquery or cross apply. The following query (using the #sample table from the question) gets all records that have a larger balance than the previously reported balance:
select *
from (select s.*,
             (select top 1 s2.Balance
              from #sample s2
              where s2.id = s.id and s2.Spenddate < s.Spenddate
              order by s2.Spenddate desc
             ) as prev_balance
      from #sample s
     ) s
where prev_balance < Balance;


Related

Find max value from coalesce function

This is my query:
declare #t table (date1 date,date2 date,date3 date)
insert into #t values ('2019-01-01','2019-01-20','2019-02-10')
insert into #t values (null,null,'2019-02-01')
insert into #t values (null,'2019-02-01','2019-02-02')
My expected output is:
2019-02-10
2019-02-01
2019-02-02
I tried to use coalesce like:
select coalesce(date1,date2,date3) as maxdate from #t
I know coalesce returns the first non-null value. So what can I do to get my desired result?
This will do the trick.
Basically you transform each row into a small data set using the VALUES clause, and then just take the MAX of it:
SELECT (SELECT MAX(LastUpdateDate)
        FROM (VALUES (date1), (date2), (date3)) AS UpdateDate(LastUpdateDate)
       ) AS LastUpdateDate
FROM #t
coalesce() has nothing to do with this. Unfortunately, SQL Server does not support greatest(). But you can use apply:
select t.*, m.max_date
from #t t cross apply
     (select max(dte) as max_date
      from (values (t.date1), (t.date2), (t.date3)) v(dte)
     ) m;
The max() ignores NULL values, so this does what you expect.
Here is a db<>fiddle.
Personally I would normalise your data by unpivoting it and then getting the MAX per group. This does, however, require some kind of column to identify each row (I use an IDENTITY column in this example):
DECLARE #t table (id int IDENTITY,
date1 date,
date2 date,
date3 date);
INSERT INTO #t
VALUES ('2019-01-01', '2019-01-20', '2019-02-10');
INSERT INTO #t
VALUES (NULL, NULL, '2019-02-01');
INSERT INTO #t
VALUES (NULL, '2019-02-01', '2019-02-02');
SELECT MAX([date])
FROM #t t
CROSS APPLY (VALUES (t.date1), (t.date2), (t.date3)) V([date])
GROUP BY t.id;

SQL Server - Partition by or Row num or Subquery - Assistance

I have the below problem.
I am trying to see how often a customer has requested Re-Activation of their Internet account.
The problem is, we capture a limited set of data to group on.
So my data set is below.
I am trying to count from the first time a Re-Activation request was created until the first time it was COMPleted. Once it has been completed, finish the count of days it took for the request to complete, and count the number of N-CO (non-completion) and SENT statuses which occurred in that time.
Below is an image of the sample data as well as the sql for the table.
Hope somebody can provide a little help.
(using SQL server 2005 compatibility)
http://imgur.com/a/9yCJm
CREATE TABLE #temp
(
    Identifier varchar(20) NOT NULL,
    CreatedDate DATETIME NOT NULL,
    CompletedDate DATETIME NOT NULL,
    SN_Type varchar(20) NOT NULL,
    SN_Status varchar(20) NOT NULL
);
INSERT INTO #temp
VALUES('64074558792','20160729','20160805','Re-Activattion','SENT');
INSERT INTO #temp
VALUES('64074558792','20160810','20160810','Re-Activattion','N-CO');
INSERT INTO #temp
VALUES('64074558792','20160812','20160812','Re-Activattion','N-CO');
INSERT INTO #temp
VALUES('64074558792','20160811','20160811','Re-Activattion','COMP');
INSERT INTO #temp
VALUES('64074558792','20160811','20160813','Re-Activattion','N-CO');
INSERT INTO #temp
VALUES ('61030203647','20160427','20160427','Re-Activattion', 'COMP');
INSERT INTO #temp
VALUES('61030203647','20160427','20160427','Re-Activattion', 'N-CO');
INSERT INTO #temp
VALUES('61030203647','20160422','20160422','Re-Activattion', 'N-CO');
INSERT INTO #temp
VALUES('61030203647','20170210','20170210','Re-Activattion', 'COMP');
INSERT INTO #temp
VALUES('61030203688','20170409','20170210','Re-Activattion', 'SENT');
INSERT INTO #temp
VALUES('61030203699','20170409','20170210','De-Activattion', 'COMP');
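No answer appears in this excerpt. A rough sketch of one approach (my assumption, not from the thread, and SQL 2005-compatible): find each Identifier's first COMP date with a grouped subquery, then count the N-CO/SENT rows created up to that date. Identifiers that never completed would need an outer join instead.

```sql
-- Sketch: first request, first completion, days between, and the number of
-- N-CO/SENT statuses created before completion, per Identifier.
SELECT t.Identifier,
       MIN(t.CreatedDate)                             AS FirstRequest,
       c.FirstComp,
       DATEDIFF(day, MIN(t.CreatedDate), c.FirstComp) AS DaysToComplete,
       SUM(CASE WHEN t.SN_Status IN ('N-CO', 'SENT')
                 AND t.CreatedDate <= c.FirstComp THEN 1 ELSE 0 END) AS NonCompletions
FROM #temp t
JOIN (SELECT Identifier, MIN(CompletedDate) AS FirstComp
      FROM #temp
      WHERE SN_Type = 'Re-Activattion' AND SN_Status = 'COMP'
      GROUP BY Identifier) c
  ON c.Identifier = t.Identifier
WHERE t.SN_Type = 'Re-Activattion'
GROUP BY t.Identifier, c.FirstComp;
```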

Tally Table in SQL

I want to create a bunch of data with Tally table in SQL (sql2008) and definitely need help.
First of all, I have this table which contains 2 columns.
{
AcctNum (nchar(30), null),
DataInfo (nchar(745), null)
}
While I don't care about the data in the DataInfo column, I do want to add about 10k rows to the table with a unique AcctNum on each row.
The problem, though, is I need to keep the length of the data in both columns. For example, the AcctNum column looks like "400000000000001 ". How do I increment the number while keeping the trailing blank space?
Not sure if I make much sense here, but please let me know and I will try to explain more, thanks!
Using a recursive common table expression:
-- set up a table variable for demo purposes
declare @t table (AcctNum nchar(30) null, DataInfo nchar(745) null);

-- insert the starting value
insert @t values ('400000000000001', null);

-- run the cte to generate the sequence
with cte (acctnum, num) as (
    select acctnum, cast(acctnum as bigint) + 1 -- starting value
    from @t
    union all
    select acctnum, num + 1 from cte
    where num < cast(acctnum as bigint) + 10000 -- stopping value
)
-- insert the generated sequence into the table
insert @t (AcctNum, DataInfo)
select num, null from cte
option (maxrecursion 10000);

select * from @t;
The table variable @t will now contain AcctNum 400000000000001 -> 400000000010001 as a contiguous sequence. Inserting the bigint back into the fixed-length nchar(30) column pads it with trailing spaces, preserving the length.
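A set-based alternative sketch (my addition, assuming the same starting number): generate the 10k numbers with ROW_NUMBER() over a cross join of a system view, which avoids the recursion and the MAXRECURSION cap.

```sql
-- Sketch: set-based tally using ROW_NUMBER() over a cross join.
declare @acct table (AcctNum nchar(30) null, DataInfo nchar(745) null);

;with n as (
    select top (10000)
           row_number() over (order by (select null)) as i
    from sys.all_objects a cross join sys.all_objects b
)
insert @acct (AcctNum, DataInfo)
select cast(400000000000000 + i as nchar(30)), -- nchar pads with trailing spaces
       null
from n;
```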

How can I insert random values into a SQL Server table?

I'm trying to randomly insert values from a list of pre-defined values into a table for testing. I tried using the solution found on this StackOverflow question:
stackoverflow.com/.../update-sql-table-with-random-value-from-other-table
When I tried this, all of my "random" values that were inserted were exactly the same for all 3000 records.
When I run the part of the query that actually selects the random row, it does select a random record every time I run it by hand, so I know the query works. My best guesses as to what is happening are:
SQL Server is optimizing the SELECT somehow, not allowing the subquery to be evaluated more than once
The random value's seed is the same on every record the query updates
I'm stuck on what my options are. Am I doing something wrong, or is there another way I should be doing this?
This is the code I'm using:
CREATE TABLE #randomStuff ([id] INT, [val] VARCHAR(100))

INSERT INTO #randomStuff ([id], [val]) VALUES (1, 'Test Value 1')
INSERT INTO #randomStuff ([id], [val]) VALUES (2, 'Test Value 2')
INSERT INTO #randomStuff ([id], [val]) VALUES (3, 'Test Value 3')
INSERT INTO #randomStuff ([id], [val]) VALUES (4, 'Test Value 4')
INSERT INTO #randomStuff ([id], [val]) VALUES (5, 'Test Value 5')
INSERT INTO #randomStuff ([id], [val]) VALUES (6, NULL)
INSERT INTO #randomStuff ([id], [val]) VALUES (7, NULL)
INSERT INTO #randomStuff ([id], [val]) VALUES (8, NULL)
INSERT INTO #randomStuff ([id], [val]) VALUES (9, NULL)
INSERT INTO #randomStuff ([id], [val]) VALUES (10, NULL)
UPDATE MyTable
SET MyColumn = (SELECT TOP 1 [val] FROM #randomStuff ORDER BY NEWID())
When the query engine sees this...
(SELECT TOP 1 [val] FROM #randomStuff ORDER BY NEWID())
... it's all like, "ooooh, a cacheable scalar subquery, I'm gonna cache that!"
You need to trick the query engine into thinking it's non-cacheable. jfar's answer was close, but the query engine was smart enough to see through the tautology of MyTable.MyColumn = MyTable.MyColumn. It ain't smart enough to see through this:
UPDATE M
SET MyColumn = (SELECT TOP 1 r.val
                FROM #randomStuff r
                INNER JOIN MyTable _MT ON M.Id = _MT.Id
                ORDER BY NEWID())
FROM MyTable M
By bringing the outer table (aliased M) into the subquery, the query engine assumes the subquery will need to be re-evaluated for every row. Anything will work, really, but I went with the (assumed) primary key MyTable.Id since it'd be indexed and would add very little overhead.
A cursor would probably be just as fast, but is most certainly not as fun.
Use a cross join to generate random data.
I've had a play with this and found a rather hacky way to do it using an intermediate temp table.
Once #randomStuff is set up, we do this (note: in my case #MyTable is also a temp table; adjust accordingly for your normal table):
CREATE TABLE #randomMappings (id INT, val VARCHAR(100), sorter UNIQUEIDENTIFIER)

INSERT INTO #randomMappings (id, val, sorter)
SELECT M.id, val, NEWID() AS sorter
FROM #MyTable AS M
CROSS JOIN #randomStuff
So at this point we have an intermediate table with every combination of (MyTable id, random value), plus a random sort value specific to each combination. Then:
DELETE others FROM #randomMappings AS others
INNER JOIN #randomMappings AS lower
ON (lower.id = others.id) AND (lower.sorter < others.sorter)
This is an old trick which deletes all rows for a given MyTable.id except the one with the lowest sort value: join the table to itself where the value is smaller, and delete any row where such a join succeeded. This leaves behind only the lowest value, so for each MyTable.id we have just one (random) value left. Then we just plug it back into the table:
UPDATE #MyTable
SET MyColumn = random.val
FROM #MyTable m, #randomMappings AS random
WHERE (random.id = m.id)
And you're done!
I said it was hacky...
I don't have time to check this right now, but my gut tells me that if you were to create a function on the server to get the random value, it would not be optimized out.
Then you would have:
UPDATE MyTable
Set MyColumn = dbo.RANDOM_VALUE()
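One caveat worth noting (my addition, not from the thread): SQL Server rejects NEWID() and RAND() directly inside a user-defined function ("invalid use of a side-effecting operator"), and a UDF cannot read a temp table either, so the usual workaround routes the nondeterminism through a view. A sketch, with hypothetical names:

```sql
-- Wrapper view: the side-effecting NEWID() lives here, not in the UDF.
CREATE VIEW dbo.RandomGuid AS SELECT NEWID() AS guid;
GO
CREATE FUNCTION dbo.RANDOM_VALUE() RETURNS uniqueidentifier
AS
BEGIN
    -- The function may read the view even though it may not call NEWID() itself.
    RETURN (SELECT guid FROM dbo.RandomGuid);
END;
```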
There is no optimization going on here.
You're using a subquery that selects a single value; there is nothing to optimize.
You can also try putting a column from the table you're updating into the subquery and see if that changes anything. That may trigger an evaluation for every row in MyTable:
UPDATE MyTable
SET MyColumn = (SELECT TOP 1 [val] FROM #randomStuff
                WHERE MyTable.MyColumn = MyTable.MyColumn
                ORDER BY NEWID())
I came up with a solution which is a bit of a hack and very inefficient (about 10 seconds to update 3000 records). Because this is being used to generate test data, I don't have to be concerned about speed, however.
In this solution, I iterate over every row in the table and update the values one row at a time. It seems to work:
DECLARE @rows INT
DECLARE @currentRow INT

SELECT @rows = COUNT(*) FROM dbo.MyTable
SET @currentRow = 1

WHILE @currentRow <= @rows -- <= so the last row is updated as well
BEGIN
    UPDATE MyTable
    SET MyColumn = (SELECT TOP 1 [val] FROM #randomStuff ORDER BY NEWID())
    WHERE MyPrimaryKey = (SELECT b.MyPrimaryKey
                          FROM (SELECT a.MyPrimaryKey,
                                       ROW_NUMBER() OVER (ORDER BY MyPrimaryKey) AS rownumber
                                FROM MyTable a) AS b
                          WHERE @currentRow = b.rownumber)
    SET @currentRow = @currentRow + 1
END

Pseudo Random Repeatable Sort in SQL Server (not NEWID() and not RAND())

I would like to randomly sort a result in a repeatable fashion, for purposes such as paging. For this, NEWID() is too random, in that the same results cannot be re-obtained. ORDER BY RAND(seed) would be ideal, as the same seed would yield the same random ordering. Unfortunately, the RAND() state resets with every row. Does anyone have a solution?
declare @seed as int;
set @seed = 1000;

create table temp (
    id int,
    date datetime)

insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')

-- re-seeds for every item
select *, RAND(), RAND(id + @seed) as r from temp order by r
--1 2009-01-19 00:00:00.000 0.277720118060575 0.732224964471124
--2 2009-01-18 00:00:00.000 0.277720118060575 0.732243597442382
--3 2009-01-17 00:00:00.000 0.277720118060575 0.73226223041364
--4 2009-01-16 00:00:00.000 0.277720118060575 0.732280863384898
--5 2009-01-15 00:00:00.000 0.277720118060575 0.732299496356156
--6 2009-01-14 00:00:00.000 0.277720118060575 0.732318129327415
-- Note how the last column is +=~0.00002
drop table temp
-- interestingly, this works:
select RAND(@seed), RAND()
--0.732206331499865 0.306382810665955
Note, I tried Rand(ID) but that just turns out to be sorted. Apparently Rand(n) < Rand(n+1)
Building off of gkrogers' hash suggestion, this works great. Any thoughts on performance?
declare @seed as int;
set @seed = 10;

create table temp (
    id int,
    date datetime)

insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')

-- ordered by a hash of (id + seed): repeatable for a given seed
select *, HASHBYTES('md5', cast(id + @seed as varchar)) r
from temp order by r
--1 2009-01-19 00:00:00.000 0x6512BD43D9CAA6E02C990B0A82652DCA
--5 2009-01-15 00:00:00.000 0x9BF31C7FF062936A96D3C8BD1F8F2FF3
--4 2009-01-16 00:00:00.000 0xAAB3238922BCC25A6F606EB525FFDC56
--2 2009-01-18 00:00:00.000 0xC20AD4D76FE97759AA27A0C99BFF6710
--3 2009-01-17 00:00:00.000 0xC51CE410C124A10E0DB5E4B97FC2AF39
--6 2009-01-14 00:00:00.000 0xC74D97B01EAE257E44AA9D5BADE97BAF
drop table temp
EDIT: Note, the declaration of @seed and its use in the query could be replaced with a parameter or a constant int if dynamic SQL is used (the T-SQL declaration is then unnecessary).
You can use a value from each row to re-evaluate the rand function:
Select *, Rand(@seed + id) as r from temp order by r
Adding the ID ensures that rand is reseeded for each row, but for a given seed value you will always get back the same sequence of rows (provided that the table does not change).
Creating a hash can be much more time-consuming than creating a seeded random number.
To get more variation in the output of RAND([seed]), you need to make the [seed] vary significantly too, possibly such as...
SELECT
*,
RAND(id * 9999) AS [r]
FROM
temp
ORDER BY
r
Using a constant ensures the replicability you asked for. But be careful of the result of (id * 9999) causing an overflow if you expect your table to get big enough...
SELECT *, checksum(id) AS r FROM table ORDER BY r
This kind of works, although the output from checksum() does not look all that random to me. The MSDN documentation states:
[...], we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change. Consider using HashBytes instead. When an MD5 hash algorithm is specified, the probability of HashBytes returning the same result for two different inputs is much lower than that of CHECKSUM.
But it may be faster.
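One tweak on this (my assumption, not in the original answer): CHECKSUM accepts multiple arguments, so the seed can be folded in, giving a different but still repeatable order per seed:

```sql
-- Sketch: mix the seed into the checksum for a per-seed repeatable order.
DECLARE @seed int;
SET @seed = 1000;
SELECT *, CHECKSUM(@seed, id) AS r FROM temp ORDER BY r;
```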
After doing some reading, this is an accepted method:
Select Rand(@seed) -- now rand is seeded
Select *, 0 * id + Rand() as r from temp order by r
Having id in the expression causes it to be re-evaluated for every row, but multiplying it by 0 ensures that it does not affect the outcome of rand.
What a horrible way of doing things!
create table temp (
id int,
date datetime)
insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')
-- not repeatable: NEWID() produces a different order on every run
select *, NEWID() r
from temp order by r
drop table temp
This has worked well for me in the past, and it can be applied to any table (just bolt on the ORDER BY clause):
SELECT *
FROM MY_TABLE
ORDER BY
(SELECT ABS(CAST(NEWID() AS BINARY(6)) % 1000) + 1);