How to replace a column value by its previous value with condition - sql

| Row 1 | Category     |
|-------|--------------|
| 1     | New business |
| 2     | Adjustment   |
| 3     | Adjustment   |
| 4     | Renewal      |
| 5     | Adjustment   |
| 6     | Cancellation |
The goal is to replace every 'Adjustment' value in Category with the nearest non-'Adjustment' value above it.
Output:
| Row 1 | Category     |
|-------|--------------|
| 1     | New business |
| 2     | New business |
| 3     | New business |
| 4     | Renewal      |
| 5     | Renewal      |
| 6     | Cancellation |

I was going to tell you it couldn't be done, but it can. I invented a "row number" column here, but you can substitute your timestamp. This is sqlite3:
CREATE TABLE data (
    row integer,
    category text
);
INSERT INTO data VALUES
    (1, 'New business'),
    (2, 'Adjustment'),
    (3, 'Adjustment'),
    (4, 'Renewal'),
    (5, 'Adjustment'),
    (6, 'Cancellation');
UPDATE data SET category = (
    SELECT category FROM data d2
    WHERE d2.category != 'Adjustment'
      AND d2.row < data.row
    ORDER BY d2.row DESC LIMIT 1
)
WHERE category = 'Adjustment';
Basically, for each 'Adjustment' row, the correlated subquery picks, from the earlier non-'Adjustment' rows, the one with the largest row number.
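If it helps to see the whole thing run end to end, here is a sketch driving the same statements through Python's built-in sqlite3 module (the table and column names just follow the example above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data (row integer, category text);
INSERT INTO data VALUES
    (1, 'New business'), (2, 'Adjustment'), (3, 'Adjustment'),
    (4, 'Renewal'), (5, 'Adjustment'), (6, 'Cancellation');

-- Each 'Adjustment' row takes the category of the nearest
-- earlier non-'Adjustment' row.
UPDATE data SET category = (
    SELECT category FROM data d2
    WHERE d2.category != 'Adjustment'
      AND d2.row < data.row
    ORDER BY d2.row DESC LIMIT 1
)
WHERE category = 'Adjustment';
""")

print(con.execute("SELECT row, category FROM data ORDER BY row").fetchall())
```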


INSERT rows into SQL Server by looping through a column with numbers

Let's say I have a very basic table:
| DAY_ID | Value | Inserts |
|--------|-------|---------|
| 5      | 8     | 2       |
| 4      | 3     | 0       |
| 3      | 3     | 0       |
| 2      | 4     | 1       |
| 1      | 8     | 0       |
I want to be able to "loop" through the Inserts column, and add that many # of rows.
For each added row, DAY_ID should decrease by 1 each time while Value stays the same; the Inserts column is irrelevant, so we can set it to 0.
So 2 new rows should be added from DAY_ID = 5 and Value = 8, and 1 new row with DAY_ID = 2 and Value = 4. The final output of the new rows would be:
| DAY_ID | Value | Inserts |
|--------|-------|---------|
| (5-1)  | 8     | 0       |
| (5-2)  | 8     | 0       |
| (2-1)  | 4     | 0       |
I haven't tried much in SQL Server, I was able to create a solution in R and Python using arrays, but I'm really hoping I can make something work in SQL Server for this project.
I think this can be done using a loop in SQL.
Looping is generally not the way you solve any problems in SQL - SQL is designed and optimized to work with sets, not one row at a time.
Consider this source table:
CREATE TABLE dbo.src(DAY_ID int, Value int, Inserts int);
INSERT dbo.src VALUES
(5, 8, 2),
(4, 3, 0),
(3, 3, 0),
(2, 4, 1),
(1, 8, 0);
There are many ways to "explode" a set based on a single value. One is to split a set of commas (replicated to the length of the value, less 1).
-- INSERT dbo.src(DAY_ID, Value, Inserts)
SELECT
    DAY_ID = DAY_ID - ROW_NUMBER() OVER (PARTITION BY DAY_ID ORDER BY @@SPID),
    src.Value,
    Inserts = 0
FROM dbo.src
CROSS APPLY STRING_SPLIT(REPLICATE(',', src.Inserts - 1), ',') AS v
WHERE src.Inserts > 0;
Output:
| DAY_ID | Value | Inserts |
|--------|-------|---------|
| 1      | 4     | 0       |
| 4      | 8     | 0       |
| 3      | 8     | 0       |
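SQLite has no STRING_SPLIT, but the same set-based "explode by a count" idea can be sketched with a recursive CTE; here it is run through Python's sqlite3 module (the table and column names mirror the example above, and the recursive CTE is a stand-in for the comma-splitting trick, not the answer's exact T-SQL):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE src(DAY_ID INT, Value INT, Inserts INT);
INSERT INTO src VALUES (5, 8, 2), (4, 3, 0), (3, 3, 0), (2, 4, 1), (1, 8, 0);
""")

# Explode each row into `Inserts` copies, numbering the copies 1..Inserts,
# then derive the new DAY_ID by subtracting the copy number.
rows = con.execute("""
WITH RECURSIVE exploded(DAY_ID, Value, k) AS (
    SELECT DAY_ID, Value, 1 FROM src WHERE Inserts > 0
    UNION ALL
    SELECT e.DAY_ID, e.Value, e.k + 1
    FROM exploded e
    JOIN src s ON s.DAY_ID = e.DAY_ID
    WHERE e.k + 1 <= s.Inserts
)
SELECT DAY_ID - k AS DAY_ID, Value, 0 AS Inserts
FROM exploded
ORDER BY 1
""").fetchall()
print(rows)  # → [(1, 4, 0), (3, 8, 0), (4, 8, 0)]
```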

Dynamic pivoting with Informix SQL

This is my data:
date id value
1/1/2021 a 5
1/1/2021 b 10
1/1/2021 c 7
1/1/2021 d 5
1/1/2021 e 6
1/2/2021 a 4
1/2/2021 b 8
1/2/2021 c 12
1/2/2021 d 3
1/2/2021 e 5
What I want to get is this:
> 1/1/2021 1/2/2021
> a 5 4
> b 10 8
> c 7 12
> d 5 3
> e 6 5
I found a solution for when the date column is fixed, but it isn't; it can have other values next time. Also, I found some solutions using dynamic SQL, but none of them works with Informix (at least I wasn't able to replicate their results).
How can this be done in Informix?
You can use dynamic SQL — or text manipulation of SQL results — to build a moderately complex SQL statement that returns the data you are after.
The answer below assumes that the table name is Data and that there is a primary key (unique) constraint on the combination of the date and id columns — assumptions that address the questions in my comment:
How many dates might you be working with? You show 2, but is it just 2 or could it be 7, 31, 365, …? Do you always have all 5 of the ID entries a .. e for each date? Is there ever any repetition of the ID values on a given date?
and answers in your response:
I don't know how many dates I might be working with, but probably from 2 to 12, shouldn't be more than 12 dates. ID's will vary too, and some dates might have them all, others don't.
Note: Informix allows you to create this table:
CREATE TABLE data
(
date DATE NOT NULL,
id CHAR(1) NOT NULL,
value INTEGER NOT NULL,
PRIMARY KEY(DATE, id)
);
Many DBMS would require the date column name to be presented as a delimited identifier enclosed in double quotes (and case-sensitive — "date"), or use a proprietary extension such as enclosing the identifier in square brackets ([date]), both in the CREATE TABLE statement and in the subsequent SQL. Informix does not — and manages to distinguish between the letters DATE as column name, data type and function name correctly.
This answer uses what I call TDQD — Test-Driven Query Design.
Relevant dates
SELECT UNIQUE date FROM data
This gives you the dates that will appear as columns. It is probable that you'll filter the data more — such as:
SELECT UNIQUE date
FROM data
WHERE date BETWEEN (TODAY - 7) AND (TODAY - 1)
ORDER BY date
You might format the results to give a string usable as a column name (and using a different date range):
SELECT UNIQUE
date AS column_date,
TO_CHAR(date, 'd%Y_%m_%d') AS column_name
FROM data
WHERE date BETWEEN DATE('2021-01-01') AND DATE('2021-01-31')
ORDER BY column_date
This assumes you have set the Informix-specific environment variable DBDATE="Y4MD-" so that DATE values are presented and interpreted like DATETIME YEAR TO DAY values are.
Relevant ID values
SELECT UNIQUE id
FROM data
WHERE date BETWEEN DATE('2021-01-01') AND DATE('2021-01-31')
ORDER BY id
This will give you the list of ID values in column 1 of the final result. However, it isn't crucial to the generated SQL.
Generate SQL for Result Table
SELECT id,
MAX(CASE WHEN date = DATE('2021-01-01') THEN value ELSE NULL END) AS d2021_01_01,
MAX(CASE WHEN date = DATE('2021-01-02') THEN value ELSE NULL END) AS d2021_01_02,
MAX(CASE WHEN date = DATE('2021-01-03') THEN value ELSE NULL END) AS d2021_01_03,
MAX(CASE WHEN date = DATE('2021-01-04') THEN value ELSE NULL END) AS d2021_01_04,
MAX(CASE WHEN date = DATE('2021-01-05') THEN value ELSE NULL END) AS d2021_01_05,
MAX(CASE WHEN date = DATE('2021-01-06') THEN value ELSE NULL END) AS d2021_01_06,
MAX(CASE WHEN date = DATE('2021-01-07') THEN value ELSE NULL END) AS d2021_01_07,
MAX(CASE WHEN date = DATE('2021-01-08') THEN value ELSE NULL END) AS d2021_01_08
FROM data
GROUP BY id
ORDER BY id;
This SQL is built using the column date and column name values from the 'relevant dates' query to generate the MAX(CASE … END) AS dYYYY_MM_DD clauses in the select-list. That has to be done outside SQL — using some program to read the relevant date information and produce the corresponding SQL.
For example, if the output of the last 'relevant dates' query is in the file date.columns, this shell script would generate the requisite SQL:
printf "SELECT id"
while read column_date column_name
do
printf ",\n MAX(CASE WHEN date = DATE('%s') THEN value ELSE NULL END) AS %s" $column_date $column_name
done < date.columns
printf "\n FROM data\n GROUP BY id\n ORDER BY id;\n"
The only difference here is that the column for the date 2021-01-08 is omitted because the value is not selected by the SQL (not present in the date.columns file).
You can use any appropriate tools to run some SQL to generate the required list of dates and give the appropriate values for column_date and column_name and then format the data into an SQL statement as shown.
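As a sketch of that generate-then-execute pattern in a language with a database driver, here is the same idea against SQLite via Python's sqlite3 module (SQLite stands in for Informix here, so the DATE() calls and DBDATE handling are dropped and dates are plain text):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data(date TEXT NOT NULL, id TEXT NOT NULL, value INT NOT NULL,
                  PRIMARY KEY(date, id));
INSERT INTO data VALUES
    ('2021-01-01', 'a', 5), ('2021-01-01', 'b', 10),
    ('2021-01-02', 'a', 4), ('2021-01-02', 'b', 8);
""")

# Step 1: the 'relevant dates' query drives the column list.
dates = [r[0] for r in con.execute("SELECT DISTINCT date FROM data ORDER BY date")]

# Step 2: generate one MAX(CASE ...) clause per date, then run the result.
cols = ",\n       ".join(
    "MAX(CASE WHEN date = '%s' THEN value END) AS d%s" % (d, d.replace('-', '_'))
    for d in dates)
pivot_sql = "SELECT id,\n       %s\nFROM data\nGROUP BY id\nORDER BY id" % cols
print(con.execute(pivot_sql).fetchall())  # → [('a', 5, 4), ('b', 10, 8)]
```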
Sample Data
INSERT INTO data VALUES('2021-01-01', 'a', 5);
INSERT INTO data VALUES('2021-01-01', 'b', 10);
INSERT INTO data VALUES('2021-01-01', 'c', 7);
INSERT INTO data VALUES('2021-01-01', 'd', 5);
INSERT INTO data VALUES('2021-01-01', 'e', 6);
INSERT INTO data VALUES('2021-01-02', 'a', 4);
INSERT INTO data VALUES('2021-01-02', 'b', 8);
INSERT INTO data VALUES('2021-01-02', 'c', 12);
INSERT INTO data VALUES('2021-01-02', 'd', 3);
INSERT INTO data VALUES('2021-01-02', 'e', 5);
INSERT INTO data VALUES('2021-01-03', 'b', 18);
INSERT INTO data VALUES('2021-01-03', 'c', 112);
INSERT INTO data VALUES('2021-01-03', 'd', 13);
INSERT INTO data VALUES('2021-01-03', 'e', 15);
INSERT INTO data VALUES('2021-01-04', 'a', 24);
INSERT INTO data VALUES('2021-01-04', 'c', 212);
INSERT INTO data VALUES('2021-01-04', 'd', 23);
INSERT INTO data VALUES('2021-01-04', 'e', 25);
INSERT INTO data VALUES('2021-01-05', 'a', 34);
INSERT INTO data VALUES('2021-01-05', 'b', 38);
INSERT INTO data VALUES('2021-01-05', 'd', 33);
INSERT INTO data VALUES('2021-01-05', 'e', 35);
INSERT INTO data VALUES('2021-01-06', 'a', 44);
INSERT INTO data VALUES('2021-01-06', 'b', 48);
INSERT INTO data VALUES('2021-01-06', 'c', 412);
INSERT INTO data VALUES('2021-01-06', 'e', 45);
INSERT INTO data VALUES('2021-01-07', 'a', 54);
INSERT INTO data VALUES('2021-01-07', 'c', 512);
INSERT INTO data VALUES('2021-01-07', 'd', 53);
Sample output
Using a Stack Overflow Markdown table:
| id      | d2021_01_01 | d2021_01_02 | d2021_01_03 | d2021_01_04 | d2021_01_05 | d2021_01_06 | d2021_01_07 | d2021_01_08 |
|---------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| CHAR(1) | INTEGER     | INTEGER     | INTEGER     | INTEGER     | INTEGER     | INTEGER     | INTEGER     | INTEGER     |
| a       | 5           | 4           |             | 24          | 34          | 44          | 54          |             |
| b       | 10          | 8           | 18          |             | 38          | 48          |             |             |
| c       | 7           | 12          | 112         | 212         |             | 412         | 512         |             |
| d       | 5           | 3           | 13          | 23          | 33          |             | 53          |             |
| e       | 6           | 5           | 15          | 25          | 35          | 45          |             |             |
Tested on a MacBook Pro running macOS 10.14.6 Mojave (yes, antique), using IBM Informix Dynamic Server Version 12.10.FC6 (yes, also antique).

TSQL For Filter Experice From Range multiselect

My table contains an Experience field, which is an integer, and my page contains a check box list of ranges: 0-3, 3-7, 7-9, 9-12, 12-15 and 15+ years. I have to filter the table with a SELECT query. I tried BETWEEN, but it doesn't work when multiple ranges are selected. Can anyone help?
My table structure is like:
Name Experience in year
---- ---------
a 1
b 2
c 3
d 5
e 2
f 1
My parameter to the database is a varchar string:
if we select 0-3 years, then '0-3'
if we select 3-6 years, then '3-6'
if we select both, then '0-3,3-6'
if we select 0-3 years and 9-12 years, then '0-3,9-12'
This is how I am sending the data at the moment. I don't know whether it is a good method, so please show me a better way.
First you need a table checkRanges
CREATE TABLE checkRanges
([checkID] int, [name] varchar(8), [low] int, [upper] int);
INSERT INTO checkRanges
([checkID], [name], [low], [upper])
VALUES
(1, '0-3', 0, 2),
(2, '3-6', 3, 5),
(4, '6-9', 6, 8),
(8, '9-12', 9, 11),
(16, '12+', 12, 999)
See how the checkID values are powers of 2?
In your app, if the user selects 3-6 and 9-12, you send 2 + 8 = 10 to your db. It would also be a good idea to build your check box list from this table.
In your db you do a bitwise comparison to select the right ranges, then perform the BETWEEN against each selected range.
WITH ranges as (
SELECT *
FROM checkRanges
where checkID & 10 > 0
)
SELECT *
FROM users u
inner join ranges r
on u.Experience between r.low and r.upper
I included more users. You only have to change the clause checkID & 10 > 0 to test other combinations.
NOTE:
I updated the ranges, changing each upper value to value - 1, because BETWEEN is inclusive and the original bounds could return duplicate rows. If you want to keep the original upper bounds instead, replace the BETWEEN in the join with:
u.Experience >= r.low AND u.Experience < r.upper
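To see the bitmask filter behave end to end, here is a sketch using Python's sqlite3 module (SQLite stands in for SQL Server; the table names follow the answer, and the mask value 1 + 8 is an assumed selection of '0-3' and '9-12'):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE checkRanges(checkID INT, name TEXT, low INT, upper INT);
INSERT INTO checkRanges VALUES
    (1, '0-3', 0, 2), (2, '3-6', 3, 5), (4, '6-9', 6, 8),
    (8, '9-12', 9, 11), (16, '12+', 12, 999);
CREATE TABLE users(name TEXT, Experience INT);
INSERT INTO users VALUES
    ('a', 1), ('b', 2), ('c', 3), ('d', 5), ('e', 2), ('f', 1);
""")

mask = 1 + 8  # the user ticked '0-3' and '9-12'

# Bitwise AND picks the selected ranges; BETWEEN matches users to them.
rows = con.execute("""
    SELECT u.name, u.Experience, r.name
    FROM users u
    JOIN checkRanges r ON u.Experience BETWEEN r.low AND r.upper
    WHERE r.checkID & ? > 0
    ORDER BY u.name
""", (mask,)).fetchall()
print(rows)  # → [('a', 1, '0-3'), ('b', 2, '0-3'), ('e', 2, '0-3'), ('f', 1, '0-3')]
```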

Finding the sum of sales when a series of conditions are met

I have a database table that looks like the one below. It contains a key (id) that identifies each row. Within each transaction there may be multiple items purchased, so transaction 103 has three different id values because three different items were bought.
Here is what I am trying to do. For a given set of conditions, I want the total number of items purchased (item_qty). Say my conditions are stores 20 and 35, AND items 7, 12 and 21: I want the total number of purchased items. Whenever condition x is met (which is the reason for the subquery), sum up the item quantity to get total sales.
Can anyone help?
transac id item_qty store item
101 1 2 20 13
102 2 1 35 21
103 3 3 35 16
103 4 1 35 12
103 5 1 35 7
104 6 1 15 21
104 7 2 20 7
I have the following query which is related to my example but when I utilize such queries on my data it returns a null value each time.
SELECT SUM(Cnt) AS "Sales Count"
FROM (SELECT ti.id, SUM(ti.item_qty) AS Cnt
FROM dbo.vTransactions1 ti
WHERE ti.store IN (20, 35)
AND ti.item IN (7, 12, 21)
GROUP BY ti.id) inner_query1;
One way of doing this would be to group by store and item and then calculate the sum. That way you can add more conditions later, based on valid combinations of (store, item). You grouped by id, which achieves nothing: each row has a unique id, so every "group" contains exactly one row.
For given condition you can write as;
;WITH CTE AS
(
    SELECT SUM(item_qty) AS Cnt, store, item
    FROM test
    GROUP BY store, item
)
SELECT SUM(Cnt) AS [Sales Count]
FROM CTE
WHERE store IN (20, 35)
  AND item IN (7, 12, 21);
I have no idea why there is a subquery here. Unless I'm missing something, this should work:
select sum (item_qty)
FROM dbo.vTransactions1 ti
WHERE ti.store IN (20, 35)
AND ti.item IN (7, 12, 21)
If you need to add extra conditions in the outer query, you either need to:
remove the SUM from SUM(Cnt), since the sum is already done in the subquery (and probably return the id as well); or
add a GROUP BY clause to the outer query as well.
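A quick way to convince yourself the plain query is enough is to run it against the sample rows; here is a sketch using Python's sqlite3 module (the table name vTransactions1 follows the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE vTransactions1(transac INT, id INT, item_qty INT, store INT, item INT);
INSERT INTO vTransactions1 VALUES
    (101, 1, 2, 20, 13), (102, 2, 1, 35, 21), (103, 3, 3, 35, 16),
    (103, 4, 1, 35, 12), (103, 5, 1, 35, 7),  (104, 6, 1, 15, 21),
    (104, 7, 2, 20, 7);
""")

# Qualifying rows: (102, qty 1), (103 item 12, qty 1),
# (103 item 7, qty 1), (104 item 7, qty 2).
total = con.execute("""
    SELECT SUM(item_qty)
    FROM vTransactions1
    WHERE store IN (20, 35)
      AND item IN (7, 12, 21)
""").fetchone()[0]
print(total)  # → 5
```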

SQL - suppressing duplicate *adjacent* records

I need to run a Select statement (DB2 SQL) that does not pull adjacent row duplicates based on a certain field. In specific, I am trying to find out when data changes, which is made difficult because it might change back to its original value.
That is to say, I have a table that vaguely resembles the below, sorted by Letter and then by Date:
A, 5, 2009-01-01
A, 12, 2009-02-01
A, 12, 2009-03-01
A, 12, 2009-04-01
A, 9, 2009-05-01
A, 9, 2009-06-01
A, 5, 2009-07-01
And I want to get the results:
A, 5, 2009-01-01
A, 12, 2009-02-01
A, 9, 2009-05-01
A, 5, 2009-07-01
discarding adjacent duplicates but keeping the last row (despite it having the same number as the first row). The obvious:
Select Letter, Number, Min(Update_Date) from Table group by Letter, Number
does not work -- it doesn't include the last row.
Edit: As there seems to have been some confusion, I have clarified the month column into a date column. It was meant as a human-parseable short form, not as actual valid data.
Edit: The last row is not important BECAUSE it is the last row, but because it has a "new value" that is also an "old value". Grouping by NUMBER would wrap it in with the first row; it needs to remain a separate entity.
Depending on which DB2 you're on, there are analytic functions which can make this problem easy to solve. An example in Oracle is below, but the select syntax appears to be pretty similar.
create table t1 (c1 char, c2 number, c3 date);
insert into t1 VALUES ('A', 5, DATE '2009-01-01');
insert into t1 VALUES ('A', 12, DATE '2009-02-01');
insert into t1 VALUES ('A', 12, DATE '2009-03-01');
insert into t1 VALUES ('A', 12, DATE '2009-04-01');
insert into t1 VALUES ('A', 9, DATE '2009-05-01');
insert into t1 VALUES ('A', 9, DATE '2009-06-01');
insert into t1 VALUES ('A', 5, DATE '2009-07-01');
SELECT C1, C2, C3
FROM (SELECT C1, C2, C3,
             LAG(C2) OVER (PARTITION BY C1 ORDER BY C3) AS PRIOR_C2,
             LEAD(C2) OVER (PARTITION BY C1 ORDER BY C3) AS NEXT_C2
      FROM T1
     )
WHERE C2 <> PRIOR_C2
   OR PRIOR_C2 IS NULL -- to pick up the first value
ORDER BY C1, C3;
C C2 C3
- ---------- -------------------
A 5 2009-01-01 00:00:00
A 12 2009-02-01 00:00:00
A 9 2009-05-01 00:00:00
A 5 2009-07-01 00:00:00
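The same LAG approach carries over to any engine with window functions; as a sketch, here it is against SQLite (3.25 or newer, which added window functions) through Python's sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t1(c1 TEXT, c2 INT, c3 TEXT);
INSERT INTO t1 VALUES
    ('A', 5, '2009-01-01'), ('A', 12, '2009-02-01'), ('A', 12, '2009-03-01'),
    ('A', 12, '2009-04-01'), ('A', 9, '2009-05-01'), ('A', 9, '2009-06-01'),
    ('A', 5, '2009-07-01');
""")

# Keep a row when its value differs from the previous row's value
# (or when there is no previous row at all).
rows = con.execute("""
SELECT c1, c2, c3
FROM (SELECT c1, c2, c3,
             LAG(c2) OVER (PARTITION BY c1 ORDER BY c3) AS prior_c2
      FROM t1)
WHERE prior_c2 IS NULL OR c2 <> prior_c2
ORDER BY c1, c3
""").fetchall()
print(rows)
```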
This is not possible with set based commands (i.e. group by and such).
You may be able to do this by using cursors.
Personally, I would get the data into my client application and do the filtering there.
The first thing you'd have to do is identify the sequence within which you wish to view/consider the data. Values of "Jan, Feb, Mar" don't help, because the data's not in alphabetical order. And what happens when you flip from Dec to Jan? Step 1: identify a sequence that uniquely defines each row with regards to your problem.
Next, you have to be able to compare item #x with item #x-1, to see if it has changed. If changed, include; if not changed, exclude. Trivial when using procedural code loops (cursors in SQL), but would you want to use those? They tend not to perform too well.
One SQL-based way to do this is to join the table to itself, with the join clause being "prev.SequenceVal = cur.SequenceVal - 1" (two aliases of the same table). Throw in a comparison, make sure you don't toss the very first row of the set (where there is no x-1), and you're done. Note that performance may suck if SequenceVal is not indexed.
Using an "EXCEPT" clause is one way to do it. See below for the solution. I've included all of my test steps here. First, I created a session table (this will go away after I disconnect from my database).
CREATE TABLE session.sample (
letter CHAR(1),
number INT,
update_date DATE
);
Then I imported your sample data:
IMPORT FROM sample.csv OF DEL INSERT INTO session.sample;
Verified that your sample information is in the database:
SELECT * FROM session.sample;
LETTER NUMBER UPDATE_DATE
------ ----------- -----------
A 5 01/01/2009
A 12 02/01/2009
A 12 03/01/2009
A 12 04/01/2009
A 9 05/01/2009
A 9 06/01/2009
A 5 07/01/2009
7 record(s) selected.
I wrote this with an EXCEPT clause, and used a WITH clause to try to make it clearer. The CTE selects every row whose (letter, number) pair appears again one month later; those rows are then excluded from a select on the whole table, so the last row of each adjacent run survives.
WITH rows_with_previous AS (
SELECT s.*
FROM session.sample s
JOIN session.sample s2
ON s.letter = s2.letter
AND s.number = s2.number
AND s.update_date = s2.update_date - 1 MONTH
)
SELECT *
FROM session.sample
EXCEPT ALL
SELECT *
FROM rows_with_previous;
Here is the result:
LETTER NUMBER UPDATE_DATE
------ ----------- -----------
A 5 01/01/2009
A 12 04/01/2009
A 9 06/01/2009
A 5 07/01/2009
4 record(s) selected.
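For comparison, here is a sketch of the EXCEPT approach against SQLite via Python's sqlite3 module (DB2's "- 1 MONTH" arithmetic is replaced by SQLite's date(..., '+1 month'), and plain EXCEPT is used since SQLite lacks EXCEPT ALL):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sample(letter TEXT, number INT, update_date TEXT);
INSERT INTO sample VALUES
    ('A', 5, '2009-01-01'), ('A', 12, '2009-02-01'), ('A', 12, '2009-03-01'),
    ('A', 12, '2009-04-01'), ('A', 9, '2009-05-01'), ('A', 9, '2009-06-01'),
    ('A', 5, '2009-07-01');
""")

# Rows whose (letter, number) pair repeats exactly one month later are
# excluded, so the LAST row of each adjacent run survives.
rows = con.execute("""
WITH repeated_next_month AS (
    SELECT s.*
    FROM sample s
    JOIN sample s2
      ON s.letter = s2.letter
     AND s.number = s2.number
     AND s2.update_date = date(s.update_date, '+1 month')
)
SELECT * FROM sample
EXCEPT
SELECT * FROM repeated_next_month
ORDER BY update_date
""").fetchall()
print(rows)
```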