I have a scenario where I need to select data based on month-end.
Raw data looks like:
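+----+------------+------+-------------+------------+
| ID | Date       | Cost | IS_REVERSED | Reverse_ID |
+----+------------+------+-------------+------------+
| 1  | 2021-01-01 | $1   | No          | NULL       |
| 2  | 2021-01-30 | $2   | YES         | NULL       |
| 3  | 2021-02-01 | $3   | NULL        | 2          |
| 4  | 2021-02-03 | $4   | No          | NULL       |
+----+------------+------+-------------+------------+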
Please note the IS_REVERSED flag column and the Reverse_ID column. If the transaction succeeds on the first attempt, the flag is No; if it succeeds on the second attempt, the flag is NULL.
My desired output: if I run the report for the end of January, it should return all transactions that happened in January (even those whose IS_REVERSED flag is YES, as long as the reversal has not happened yet):
ID 1, 2
For the next month-end, I need to report data from January and February combined, and the desired output is:
ID 1, 3 and 4
ID 2 should not be reported because it has a reverse flag of YES.
Any pointers to achieve this would be much appreciated.
Create table Test_Report
(ID Int
,[Date] date
,Cost varchar(100)
,Is_reversed varchar(100)
,Reversed_ID int)
insert into Test_Report values
(1 ,'2021-01-01', '$1' , 'No', NULL),
(2 ,'2021-01-30', '$2' , 'YES', NULL),
(3 ,'2021-02-01', '$3' , NULL, 2),
(4 ,'2021-02-03', '$4' , 'No', NULL)
I need a single query where I can pass [Date] as a condition to filter the records: date < '2021-01-31' should return IDs 1 and 2, and date < '2021-02-28' should return IDs 1, 3 and 4.
ID 2 should not be returned in the second case because the transaction is reversed and there is a new transaction (ID 3) with the Is_reversed flag set to NULL.
Thanks
I used a dynamic SQL query to achieve the results. Since your data is not clear and the datatypes do not look correct, I have made some assumptions, so the DDL and DML are included with the answer.
Create table Test_Report
(ID Int
,[Date] varchar(100)
,Cost varchar(100)
,Is_reversed varchar(100)
,Reversed_ID int)
insert into Test_Report values (1 ,'01 Jan', '$1' , 'No', NULL)
,(2 ,'30 Jan', '$2' , 'YES', NULL)
,(3 ,'01 Feb', '$3' , NULL, 2)
,(4 ,'03 Feb', '$4' , 'No', NULL)
declare @from varchar(100) = 'Jan'
, @to varchar(100) = 'Feb'
declare @where varchar(max)
declare @query varchar(max)
set @query = 'select * from Test_Report where '
if @from = @to
begin
set @where = ' [Date] like ''%'+@from+ '%'' and ( isnull(Is_reversed,'''') in (''No'', ''YES'', '''')) and Reversed_ID is null'
end
else
begin
set @where = ' ( [Date] like ''%'+@from+ '%'' or [Date] like ''%'+@to+ '%'' ) and ( isnull(Is_reversed,'''') in (''No'', ''''))'
end
Declare @Main varchar(max) = @query + @where
--This statement can be used to test the resulted query
--select @Main
exec(@Main)
I am assuming you need to input the date of the report you wish to run. In this case I would declare a variable for your WHERE clause.
This variable holds the current date, but you can replace GETDATE() with a date value to run the report for a different month. Now the WHERE clause...
DECLARE @DATEVAR AS DATE = GETDATE()
SELECT *
,EOMONTH(#Test_Report.[Date],0) [EOMONTH for WHERE clause]
FROM #Test_Report
WHERE (Is_reversed != 'YES'
OR Is_reversed IS NULL)
AND #Test_Report.[Date] <= EOMONTH(@DATEVAR,0)
The EOMONTH() function can be used here to get the end of the selected month, and select all records dated on or before that month-end date. Since you want records where Is_reversed is either not YES or is NULL, we need to put the OR condition in parentheses.
A more clever way to deal with the nulls is the ISNULL(,) function. You can use it in your WHERE clause to make that constraint a 1-liner.
SELECT *
,EOMONTH(#Test_Report.[Date],0) [EOMONTH for WHERE clause]
FROM #Test_Report
WHERE ISNULL(Is_reversed,'No') != 'YES'
AND #Test_Report.[Date] <= EOMONTH(@DATEVAR,0)
Either method should do the trick!
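Note that the two EOMONTH queries above filter on the flag only, so ID 2 is left out of the January run as well. One possible way to cover both runs is to drop a row only when a reversing row (one whose Reversed_ID points at it) is dated on or before the cutoff. Below is a minimal sketch of that idea, run through Python's built-in sqlite3 module so it can be tried without a SQL Server instance. The function name report_ids is hypothetical, and the sketch assumes Reversed_ID always holds the ID of the transaction being reversed.

```python
import sqlite3


def report_ids(conn, cutoff):
    """Return IDs dated on or before `cutoff` that no row has reversed by then."""
    sql = """
        SELECT t.ID
        FROM Test_Report AS t
        WHERE t.[Date] <= :cutoff
          AND NOT EXISTS (
                SELECT 1
                FROM Test_Report AS r
                WHERE r.Reversed_ID = t.ID      -- r is the reversal of t
                  AND r.[Date] <= :cutoff)      -- and it happened by the cutoff
        ORDER BY t.ID
    """
    return [row[0] for row in conn.execute(sql, {"cutoff": cutoff})]


# Sample data from the question; dates are ISO strings, which sort correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test_Report "
    "(ID INT, [Date] TEXT, Cost TEXT, Is_reversed TEXT, Reversed_ID INT)"
)
conn.executemany(
    "INSERT INTO Test_Report VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2021-01-01", "$1", "No", None),
        (2, "2021-01-30", "$2", "YES", None),
        (3, "2021-02-01", "$3", None, 2),
        (4, "2021-02-03", "$4", "No", None),
    ],
)

print(report_ids(conn, "2021-01-31"))  # [1, 2]
print(report_ids(conn, "2021-02-28"))  # [1, 3, 4]
```

The WHERE clause itself is plain SQL, so it should carry over to SQL Server with :cutoff replaced by EOMONTH(@DATEVAR); that part is untested here.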
I am attempting to build a simple lookup table that will have its own web page display. I am providing simple search variables, and one of those variables is Status. I want users to be able to choose Active, Inactive, or both, and I am looking for an easy way to query for both at once. There are other statuses in the database, such as 'D' for a soft delete, that I do not want returned at all.
Declare @stat nvarchar(5) = 3
Select [Status]
from tableUser
where [Status] in (CASE @stat
WHEN 1 THEN 'A'
WHEN 2 THEN 'I'
WHEN 3 THEN 'A','I'
END)
The above is what I have tried.
Just use boolean logic:
WHERE (@stat IN (1, 3) AND Status = 'A') OR
(@stat IN (2, 3) AND Status = 'I')
You want the CASE expression to produce a list of values, but that's not how it works. A CASE expression produces a single value (each branch can be an expression, but it must still reduce to one scalar value). 'A','I' does not reduce to a single value, so you cannot use it as the result of a CASE expression.
Instead, write the condition more like this:
WHERE 1 = CASE WHEN @stat = 1 AND [Status] = 'A' THEN 1
WHEN @stat = 2 AND [Status] = 'I' THEN 1
WHEN @stat = 3 AND [Status] IN ('A', 'I') THEN 1
ELSE 0 END
or remove the CASE expressions and build all that directly into the WHERE clause:
WHERE ( (@stat = 1 AND [Status] = 'A')
OR (@stat = 2 AND [Status] = 'I')
OR (@stat = 3 AND [Status] IN ('A', 'I'))
)
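To see the boolean-logic filter behave as intended, here is a small sketch that runs it against an in-memory SQLite database from Python. The helper name statuses_for and the sample rows are made up for illustration; only the WHERE clause comes from the answers above.

```python
import sqlite3


def statuses_for(conn, stat):
    """Return the Status values selected for search option 1 (A), 2 (I) or 3 (both)."""
    sql = """
        SELECT [Status]
        FROM tableUser
        WHERE (:stat IN (1, 3) AND [Status] = 'A')
           OR (:stat IN (2, 3) AND [Status] = 'I')
        ORDER BY [Status]
    """
    return [row[0] for row in conn.execute(sql, {"stat": stat})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableUser ([Status] TEXT)")
# Hypothetical rows: two active, one inactive, one soft-deleted.
conn.executemany("INSERT INTO tableUser VALUES (?)", [("A",), ("I",), ("D",), ("A",)])

print(statuses_for(conn, 1))  # ['A', 'A']
print(statuses_for(conn, 2))  # ['I']
print(statuses_for(conn, 3))  # ['A', 'A', 'I']
```

The soft-deleted 'D' row is never returned, because no branch of the condition mentions it.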
I receive raw data files from external sources and need to provide analysis on them. I load the files into a table & set the fields as varchars, then run a complex SQL script that does some automated analysis. One issue I've been trying to resolve is: How to tell if a column of data is duplicated with 1 or more other columns in that same table?
My goal is to have, for every column, a hash, checksum, or something similar that looks at a column's values in every row in the order they come in. I have dynamic SQL that loops through every field (different tables will have a variable number of columns) based on the fields listed in INFORMATION_SCHEMA.COLUMNS, so no concerns on how to accomplish that part.
I've been researching this all day but can't seem to find any sensible way to hash every row of a field. Google & StackOverflow searches return how to do various things to rows of data, but I couldn't find much on how to do the same thing vertically on a field.
So, I considered 2 possibilities & hit 2 roadblocks:
HASHBYTES - Use 'FOR XML PATH' (or similar) to grab every row & use a delimiter between each row, then use HASHBYTES to hash the long string. Unfortunately, this won't work for me since I'm running SQL Server 2014, where HASHBYTES is limited to an input of 8000 bytes. (I can also imagine performance would be abysmal on tables with millions of rows, looped for 200+ columns).
CHECKSUM + CHECKSUM_AGG - Get the CHECKSUM of each value, turning it into an integer, then use CHECKSUM_AGG on the results (since CHECKSUM_AGG needs integers). This looks promising, but the order of the data is not considered, returning the same value on different rows. Plus the risk of collisions is higher.
The second looked promising but doesn't work as I had hoped...
declare @t1 table
(col_1 varchar(5)
, col_2 varchar(5)
, col_3 varchar(5));
insert into @t1
values ('ABC', 'ABC', 'ABC')
, ('ABC', 'ABC', 'BCD')
, ('BCD', 'BCD', NULL)
, (NULL, NULL, 'ABC');
select * from @t1;
select cs_1 = CHECKSUM(col_1)
, cs_2 = CHECKSUM(col_2)
, cs_3 = CHECKSUM(col_3)
from @t1;
select csa_1 = CHECKSUM_AGG(CHECKSUM([col_1]))
, csa_2 = CHECKSUM_AGG(CHECKSUM([col_2]))
, csa_3 = CHECKSUM_AGG(CHECKSUM([col_3]))
from @t1;
In the last result set, all 3 columns bring back the same value: 2147449198.
Desired results: My goal is to have some code where csa_1 and csa_2 bring back the same value, while csa_3 brings back a different value, indicating that it's its own unique set.
You could compare every column combo in this way, rather than using hashes:
select case when count(case when column1 = column2 then 1 else null end) = count(1) then 1 else 0 end Column1EqualsColumn2
, case when count(case when column1 = column3 then 1 else null end) = count(1) then 1 else 0 end Column1EqualsColumn3
, case when count(case when column1 = column4 then 1 else null end) = count(1) then 1 else 0 end Column1EqualsColumn4
, case when count(case when column1 = column5 then 1 else null end) = count(1) then 1 else 0 end Column1EqualsColumn5
, case when count(case when column2 = column3 then 1 else null end) = count(1) then 1 else 0 end Column2EqualsColumn3
, case when count(case when column2 = column4 then 1 else null end) = count(1) then 1 else 0 end Column2EqualsColumn4
, case when count(case when column2 = column5 then 1 else null end) = count(1) then 1 else 0 end Column2EqualsColumn5
, case when count(case when column3 = column4 then 1 else null end) = count(1) then 1 else 0 end Column3EqualsColumn4
, case when count(case when column3 = column5 then 1 else null end) = count(1) then 1 else 0 end Column3EqualsColumn5
, case when count(case when column4 = column5 then 1 else null end) = count(1) then 1 else 0 end Column4EqualsColumn5
from myData a
Here's the setup code:
create table myData
(
id integer not null identity(1,1)
, column1 nvarchar (32)
, column2 nvarchar (32)
, column3 nvarchar (32)
, column4 nvarchar (32)
, column5 nvarchar (32)
)
insert myData (column1, column2, column3, column4, column5)
values ('hello', 'hello', 'no', 'match', 'match')
,('world', 'world', 'world', 'world', 'world')
,('repeat', 'repeat', 'repeat', 'repeat', 'repeat')
,('me', 'me', 'me', 'me', 'me')
And here's the obligatory SQL Fiddle.
Also, to save you having to write this here's some code to generate the above. This version will also include logic to handle scenarios where both columns' values are null:
declare @tableName sysname = 'myData'
, @sql nvarchar(max)
;with cte as (
select name, row_number() over (order by column_id) r
from sys.columns
where object_id = object_id(@tableName, 'U') --filter on our table
and name not in ('id') --only process for the columns we're interested in
)
select @sql = coalesce(@sql + char(10) + ', ', 'select') + ' case when count(case when ' + quotename(a.name) + ' = ' + quotename(b.name) + ' or (' + quotename(a.name) + ' is null and ' + quotename(b.name) + ' is null) then 1 else null end) = count(1) then 1 else 0 end ' + quotename(a.name + '_' + b.name)
from cte a
inner join cte b
on b.r > a.r
order by a.r, b.r
set @sql = @sql + char(10) + 'from ' + quotename(@tableName)
print @sql
NB: That's not to say you should run it as dynamic SQL; rather you can use this to generate your code (unless you need to support the scenario where the number or name of columns may vary at runtime, in which case you'd obviously want the dynamic option).
NEW SOLUTION
EDIT: Based on some new information, namely that there may be more than 200 columns, my suggestion is to compute hashes for each column, but perform it in the ETL tool.
Essentially, feed your data buffer through a transformation that computes a cryptographic hash of the previously-computed hash concatenated with the current column value. When you reach the end of the stream, you will have serially-generated hash values for each column, that are a proxy for the content and order of each set.
Then, you can compare each to all of the others almost instantly, as opposed to running 20,000 table scans.
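As a rough illustration of the chained-hash idea, here is a minimal Python sketch. It is not tied to any particular ETL tool: the function names column_digests and duplicate_groups are hypothetical, SHA-256 is just one reasonable choice of hash, and NULLs are tagged so that NULL and an empty string do not hash alike.

```python
import hashlib


def column_digests(rows, n_cols):
    """Fold each column's values, in row order, into one running SHA-256 digest."""
    digests = [b""] * n_cols
    for row in rows:
        for i, value in enumerate(row):
            # Prefix a marker byte so NULL (None) differs from the empty string.
            if value is None:
                payload = b"\x00"
            else:
                payload = b"\x01" + str(value).encode("utf-8")
            # New digest = hash(previous digest + current value), so order matters.
            digests[i] = hashlib.sha256(digests[i] + payload).digest()
    return [d.hex() for d in digests]


def duplicate_groups(names, digests):
    """Group column names that ended up with the same digest."""
    groups = {}
    for name, digest in zip(names, digests):
        groups.setdefault(digest, []).append(name)
    return [group for group in groups.values() if len(group) > 1]


# Sample data from the question.
rows = [
    ("ABC", "ABC", "ABC"),
    ("ABC", "ABC", "BCD"),
    ("BCD", "BCD", None),
    (None, None, "ABC"),
]
names = ["col_1", "col_2", "col_3"]

print(duplicate_groups(names, column_digests(rows, 3)))  # [['col_1', 'col_2']]
```

Each column is read once in a single pass over the rows, and comparing the resulting digests is a dictionary lookup rather than a pairwise scan. Matching digests are strong evidence of identical columns, not proof, so a direct comparison of the flagged pairs is still a sensible final check.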
OLD SOLUTION
Try this. Basically, you'll need a query like this to analyze each column against the others. There is not really a feasible hash-based solution. Just compare each set by its insertion order (some sort of row sequence number). Either generate this number during ingestion, or project it during retrieval, if you have a computationally-feasible means of doing so.
NOTE: I took liberties with the NULL here, comparing it as an empty string.
declare @t1 table
(
rownum int identity(1,1)
, col_1 varchar(5)
, col_2 varchar(5)
, col_3 varchar(5));
insert into @t1
values ('ABC', 'ABC', 'ABC')
, ('ABC', 'ABC', 'BCD')
, ('BCD', 'BCD', NULL)
, (NULL, NULL, 'ABC');
with col_1_sets as
(
select
t1.rownum as col_1_rownum
, CASE WHEN t2.rownum IS NULL THEN 1 ELSE 0 END AS col_2_miss
, CASE WHEN t3.rownum IS NULL THEN 1 ELSE 0 END AS col_3_miss
from
@t1 as t1
left join @t1 as t2 on
t1.rownum = t2.rownum
AND isnull(t1.col_1, '') = isnull(t2.col_2, '')
left join @t1 as t3 on
t1.rownum = t3.rownum
AND isnull(t1.col_1, '') = isnull(t3.col_3, '')
),
col_1_misses as
(
select
SUM(col_2_miss) as col_2_misses
, SUM(col_3_miss) as col_3_misses
from
col_1_sets
)
select
'col_1' as column_name
, CASE WHEN col_2_misses = 0 THEN 1 ELSE 0 END AS is_col_2_match
, CASE WHEN col_3_misses = 0 THEN 1 ELSE 0 END AS is_col_3_match
from
col_1_misses
Results:
+-------------+----------------+----------------+
| column_name | is_col_2_match | is_col_3_match |
+-------------+----------------+----------------+
| col_1 | 1 | 0 |
+-------------+----------------+----------------+
I have two tables named Retail and Activity and the data is as shown below:
Retail Table
Activity Table
My main concern is the Ok and Fault columns of the Retail table: as you can see, they contain comma-separated ActivityId values.
What I want is: if the Ok column contains an ActivityId, the corresponding activity column should show Yes; if the Fault column contains it, it should show No.
Note that there are only four activity columns and they are fixed. For each of the four, I have to check whether its ActivityId appears in Ok or Fault; only then should Yes or No be printed, otherwise NULL.
The desired result should look like:
If the value is in Ok then Yes, otherwise No.
I am guessing you want to store 'Yes' or 'No' in some column. Below is the query to update that column:
UPDATE RetailTable
SET <Result_Column>=
CASE
WHEN Ok IS NOT NULL THEN 'Yes'
WHEN Fault IS NOT NULL THEN 'No'
END
You can use the code below as a starting point:
DECLARE @Retail TABLE
(
PhoneAuditID INT,
HandsetQuoteID INT,
Ok VARCHAR(50)
)
INSERT INTO @Retail VALUES (1, 1009228, '4,22,5')
INSERT INTO @Retail VALUES (2, 1009229, '1')
DECLARE @Activity TABLE
(
ID INT,
Activity VARCHAR(50)
)
INSERT INTO @Activity VALUES (1, 'BatteryOK?'), (4, 'PhonePowersUp?'), (22,'SomeOtherQuestion?'), (5,'LCD works OK?')
SELECT R.[PhoneAuditID], R.[HandsetQuoteID], A.[Activity], [Ok] = CASE WHEN A.[ID] IS NOT NULL THEN 'Yes' END
FROM @Retail R
CROSS APPLY dbo.Split(R.Ok, ',') S
LEFT JOIN @Activity A ON S.[items] = A.[ID]
I have used Split function provided here:
separate comma separated values and store in table in sql server
Try the following query. I have used PIVOT to show rows as columns. I have also used a split function to split the ID values, which you can easily find on the net:
CREATE TABLE PhoneAudit
(
PhoneAuditRetailID INT,
HandsetQuoteID INT,
Ok VARCHAR(50),
Fault VARCHAR(50)
)
INSERT INTO PhoneAudit VALUES (1,10090,'1,2','3')
CREATE TABLE ActivityT
(
ID INT,
Activity VARCHAR(100)
)
INSERT INTO ActivityT VALUES (1,'Battery')
INSERT INTO ActivityT VALUES (2,'HasCharger')
INSERT INTO ActivityT VALUES (3,'HasMemoryCard')
INSERT INTO ActivityT VALUES (4,'Test')
DECLARE @SQL AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
SELECT @ColumnName= ISNULL(@ColumnName + ',','') + QUOTENAME(Activity) FROM (SELECT DISTINCT Activity FROM ActivityT) AS Activities
SET @SQL = 'SELECT PhoneAuditRetailID, HandsetQuoteID,
' + @ColumnName + '
FROM
(SELECT
t1.PhoneAuditRetailID,
t1.HandsetQuoteID,
TEMPOK.*
FROM
PhoneAudit t1
CROSS APPLY
(
SELECT
Activity,
(CASE WHEN ID IN (SELECT * FROM dbo.SplitIDs(t1.Ok,'',''))
THEN ''YES''
ELSE ''NO''
END) AS VALUE
FROM
ActivityT t2
) AS TEMPOK) AS t3
PIVOT
(
MIN(VALUE)
FOR Activity IN ('+ @ColumnName + ')
) AS PivotTable;'
EXEC sp_executesql @SQL
DROP TABLE PhoneAudit
DROP TABLE ActivityT
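The rule stated in the question (Yes if the activity's ID is listed in Ok, No if it is listed in Fault, NULL otherwise) can be pinned down independently of the SQL. The following is only a sketch of that rule in Python, not a replacement for the queries above; the function name audit_flags is hypothetical and the sample values reuse the PhoneAudit and ActivityT rows from the previous answer.

```python
def audit_flags(ok, fault, activities):
    """Map each activity name to 'Yes', 'No' or None from comma-separated ID lists."""

    def ids(csv):
        # Treat NULL or empty strings as an empty set of IDs.
        return {int(part) for part in (csv or "").split(",") if part.strip()}

    ok_ids, fault_ids = ids(ok), ids(fault)
    return {
        name: "Yes" if aid in ok_ids else "No" if aid in fault_ids else None
        for aid, name in activities.items()
    }


activities = {1: "Battery", 2: "HasCharger", 3: "HasMemoryCard", 4: "Test"}

print(audit_flags("1,2", "3", activities))
# {'Battery': 'Yes', 'HasCharger': 'Yes', 'HasMemoryCard': 'No', 'Test': None}
```

Note that the pivot query above returns NO for every activity not found in Ok, whereas this rule leaves activities that appear in neither list as NULL, which is what the question asks for.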
There are several ways to do this. If you are looking for a purely declarative approach, you could use a recursive CTE. The following example of this is presented as a generic solution with test data which should be adaptable to your needs:
Declare @Delimiter As Varchar(2)
Set @Delimiter = ','
Declare @Strings As Table
(
String Varchar(50)
)
Insert Into @Strings
Values
('12,345,6,78,9'),
(Null),
(''),
('123')
;With String_Columns As
(
-- Anchor member: peel the first value off each string
Select
String,
Case
When String Is Null Then ''
When Len(String) = 0 Then ''
When CharIndex(@Delimiter,String,0) = 0 Then String -- no delimiter: the whole string is the only value
Else Left(String,CharIndex(@Delimiter,String,0)-1)
End As String_Column,
Case
When String Is Null Then ''
When CharIndex(@Delimiter,String,0) = 0 Then ''
When Len(String) = 0 Then ''
When Len(Left(String,CharIndex(@Delimiter,String,0)-1)) = 0 Then ''
Else Right(String,Len(String)-Len(Left(String,CharIndex(@Delimiter,String,0)-1))-1)
End As Remainder,
1 As String_Column_Number
From
@Strings
Union All
-- Recursive member: peel the next value off what is left
Select
String,
Case
When CharIndex(@Delimiter,Remainder,0) = 0 Then Remainder
Else Left(Remainder,CharIndex(@Delimiter,Remainder,0)-1)
End As String_Column,
Case
When CharIndex(@Delimiter,Remainder,0) = 0 Then ''
When Len(Left(Remainder,CharIndex(@Delimiter,Remainder,0)-1)) = 0 Then ''
Else Right(Remainder,Len(Remainder)-Len(Left(Remainder,CharIndex(@Delimiter,Remainder,0)-1))-1)
End As Remainder,
String_Column_Number + 1
From
String_Columns
Where
-- > 0 rather than > 1, so a trailing one-character value such as '9' is not dropped
(Remainder Is Not Null And Len(Remainder) > 0)
)
Select
String,
String_Column,
String_Column_Number
From
String_Columns
Imagine the following two tables:
create table MainTable (
MainId integer not null, -- This is the index
Data varchar(100) not null
)
create table OtherTable (
MainId integer not null, -- MainId, Name combined are the index.
Name varchar(100) not null,
Status tinyint not null
)
Now I want to select all the rows from MainTable, while combining all the rows that match each MainId from OtherTable into a single field in the result set.
Imagine the data:
MainTable:
1, 'Hi'
2, 'What'
OtherTable:
1, 'Fish', 1
1, 'Horse', 0
2, 'Fish', 0
I want a result set like this:
MainId, Data, Others
1, 'Hi', 'Fish=1,Horse=0'
2, 'What', 'Fish=0'
What is the most elegant way to do this?
(Don't worry about the comma being in front or at the end of the resulting string.)
There is no really elegant way to do this in Sybase. Here is one method, though:
select
mt.MainId,
mt.Data,
Others = stuff((
max(case when seqnum = 1 then ','+Name+'='+cast(status as varchar(255)) else '' end) +
max(case when seqnum = 2 then ','+Name+'='+cast(status as varchar(255)) else '' end) +
max(case when seqnum = 3 then ','+Name+'='+cast(status as varchar(255)) else '' end)
), 1, 1, '')
from MainTable mt
left outer join
(select
ot.*,
row_number() over (partition by MainId order by status desc) as seqnum
from OtherTable ot
) ot
on mt.MainId = ot.MainId
group by
mt.MainId, mt.Data
That is, it enumerates the values in the second table. It then does conditional aggregation to get each value, using the stuff() function to handle the extra comma. The above works for the first three values. If you want more, then you need to add more clauses.
Well, here is how I implemented it in Sybase 13.x. This code has the advantage of not being limited to a number of Names.
create proc
as
declare
@MainId int,
@Name varchar(100),
@Status tinyint
create table #OtherTable (
MainId int not null,
CombStatus varchar(250) not null
)
declare OtherCursor cursor for
select
MainId, Name, Status
from
OtherTable
open OtherCursor
fetch OtherCursor into @MainId, @Name, @Status
while (@@sqlstatus = 0) begin -- run until there are no more
if exists (select 1 from #OtherTable where MainId = @MainId) begin
update #OtherTable
set CombStatus = CombStatus + ','+@Name+'='+convert(varchar, @Status)
where
MainId = @MainId
end else begin
insert into #OtherTable (MainId, CombStatus)
select
MainId = @MainId,
CombStatus = @Name+'='+convert(varchar, @Status)
end
fetch OtherCursor into @MainId, @Name, @Status
end
close OtherCursor
select
mt.MainId,
mt.Data,
ot.CombStatus
from
MainTable mt
left join #OtherTable ot
on mt.MainId = ot.MainId
But it does have the disadvantage of using a cursor and a working table, which can - at least with a lot of data - make the whole process slow.
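The cursor loop boils down to appending Name=Status to a per-MainId string and then left-joining that back to MainTable. For illustration only, here is the same accumulation sketched in Python; this is not Sybase code, and the function name combine_others is hypothetical.

```python
def combine_others(main_rows, other_rows):
    """Attach a comma-separated 'Name=Status' list to each MainTable row."""
    combined = {}
    for main_id, name, status in other_rows:
        # Same effect as the insert-or-update branch inside the cursor loop.
        combined.setdefault(main_id, []).append(f"{name}={status}")
    # Rows with no match keep None, mirroring the LEFT JOIN.
    return [
        (main_id, data, ",".join(combined[main_id]) if main_id in combined else None)
        for main_id, data in main_rows
    ]


main_rows = [(1, "Hi"), (2, "What")]
other_rows = [(1, "Fish", 1), (1, "Horse", 0), (2, "Fish", 0)]

print(combine_others(main_rows, other_rows))
# [(1, 'Hi', 'Fish=1,Horse=0'), (2, 'What', 'Fish=0')]
```

If the row volume makes the cursor too slow, doing this step in the calling application is one option to weigh against the extra data transferred.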