SQL Timeline Query - sql

I have already read several posts but have not been able to find a solution.
I have a table, created by the statements below,
and I would like to transform the data so that I get one row per ID and one column per date that shows the status. The value column does not change for a given id.
The result could be with or without the value column.
I am currently not able to produce either.
CREATE TABLE test (
id INT,
date1 text,
status1 INT,
value1 INT
);
INSERT INTO test VALUES (1, '01.01.2022', 1, 60);
INSERT INTO test VALUES (2, '01.01.2022', 1, 30);
INSERT INTO test VALUES (3, '01.01.2022', 7, 90);
INSERT INTO test VALUES (1, '02.01.2022', 7, 60);
INSERT INTO test VALUES (2, '02.01.2022', 7, 30);
INSERT INTO test VALUES (3, '02.01.2022', 3, 90);
INSERT INTO test VALUES (1, '03.01.2022', 7, 60);
INSERT INTO test VALUES (2, '03.01.2022', 5, 30);
INSERT INTO test VALUES (3, '03.01.2022', 7, 90);
Based on your suggestions I tried:
SELECT *
FROM
(
SELECT id, value1
FROM test
) AS SourceTable
PIVOT(AVG(status1) FOR date1 IN(select DISTINCT date1
from test)) AS PivotTable;
But I cannot find my error.

Schema (MySQL v8.0)
CREATE TABLE test (
id INT,
date text,
status INT,
value INT
);
INSERT INTO test VALUES (1, '01.01.2022', 1, 60);
INSERT INTO test VALUES (2, '01.01.2022', 1, 30);
INSERT INTO test VALUES (3, '01.01.2022', 7, 90);
INSERT INTO test VALUES (1, '02.01.2022', 7, 60);
INSERT INTO test VALUES (2, '02.01.2022', 7, 30);
INSERT INTO test VALUES (3, '02.01.2022', 3, 90);
INSERT INTO test VALUES (1, '03.01.2022', 7, 60);
INSERT INTO test VALUES (2, '03.01.2022', 5, 30);
INSERT INTO test VALUES (3, '03.01.2022', 7, 90);
Query #1
SELECT
ID,
MAX(VALUE) AS VALUE,
sum(CASE WHEN date = '01.01.2022' THEN status ELSE 0 END) AS '01.01.2022',
sum(CASE WHEN date = '02.01.2022' THEN status ELSE 0 END) AS '02.01.2022',
sum(CASE WHEN date = '03.01.2022' THEN status ELSE 0 END) AS '03.01.2022'
FROM test
GROUP BY ID;
ID | VALUE | 01.01.2022 | 02.01.2022 | 03.01.2022
1 | 60 | 1 | 7 | 7
2 | 30 | 1 | 7 | 5
3 | 90 | 7 | 3 | 7
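The PIVOT attempt in the question tried to take the list of dates from a subquery. A minimal sketch of one way to get that effect with the conditional aggregation shown in Query #1 is to read the distinct dates first and then generate one SUM(CASE ...) column per date. The sketch below uses Python's built-in sqlite3 module rather than MySQL, and the helper names (build_pivot_sql, pivot) are assumptions made for illustration only.

```python
import sqlite3

ROWS = [
    (1, "01.01.2022", 1, 60), (2, "01.01.2022", 1, 30), (3, "01.01.2022", 7, 90),
    (1, "02.01.2022", 7, 60), (2, "02.01.2022", 7, 30), (3, "02.01.2022", 3, 90),
    (1, "03.01.2022", 7, 60), (2, "03.01.2022", 5, 30), (3, "03.01.2022", 7, 90),
]

def build_pivot_sql(dates):
    """Return a pivot query with one SUM(CASE ...) column per date."""
    # The date values are bound as parameters; only the column alias is
    # interpolated, with embedded double quotes doubled to keep it valid.
    cols = ",\n  ".join(
        'SUM(CASE WHEN date = ? THEN status ELSE 0 END) AS "{}"'.format(
            d.replace('"', '""'))
        for d in dates
    )
    return ("SELECT id, MAX(value) AS value,\n  " + cols +
            "\nFROM test GROUP BY id ORDER BY id")

def pivot(conn):
    # Sorting these DD.MM.YYYY strings is only chronological here because
    # all sample dates share the same month and year.
    dates = [r[0] for r in
             conn.execute("SELECT DISTINCT date FROM test ORDER BY date")]
    cur = conn.execute(build_pivot_sql(dates), dates)
    header = [c[0] for c in cur.description]
    return header, cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INT, date TEXT, status INT, value INT)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", ROWS)

header, rows = pivot(conn)
print(header)
for row in rows:
    print(row)
```

The two-step approach is needed because the set of result columns must be known when the statement is prepared; the same idea should carry over to other databases with their own quoting rules for column aliases.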

Related

summing by rows sql

I attempted to do it using an analytic function, but it appears that I did so incorrectly.
How can I receive the output from the table I've been given?
CREATE TABLE rides (
ride_id INT,
driver_id INT,
ride_in_kms INT,
ride_fare FLOAT,
ride_date DATE
);
INSERT INTO rides VALUES (1, 1, 3, 4.45, "2016-05-16");
INSERT INTO rides VALUES (2, 1, 4, 8.46, "2016-05-16");
INSERT INTO rides VALUES (3, 2, 6, 11.9, "2016-05-16");
INSERT INTO rides VALUES (4, 3, 3, 6.76, "2016-05-16");
INSERT INTO rides VALUES (5, 2, 6, 13.55, "2016-05-16");
INSERT INTO rides VALUES (6, 4, 3, 4.91, "2016-05-20");
INSERT INTO rides VALUES (7, 1, 7, 16.77, "2016-05-20");
INSERT INTO rides VALUES (8, 3, 9, 16.18, "2016-05-20");
INSERT INTO rides VALUES (9, 2, 3, 6.07, "2016-05-20");
INSERT INTO rides VALUES (10, 4, 4, 6.25, "2016-05-20");
Output result
Thanks in advance
The general gist is to use an expression within the sum() to operate on the correct rows:
select
driver_id,
sum(case when ride_date = "2016-05-16" then ride_in_kms else 0 end) `KMS_MAY_16`,
sum(case when ride_date = "2016-05-20" then ride_in_kms else 0 end) `KMS_MAY_20`
from rides
group by driver_id;
The particular syntax available, and how to express the column label, depend on which database you are using.
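As a runnable illustration of the conditional sum above, here is a minimal sketch against the question's sample data using Python's built-in sqlite3 module. SQLite is an assumption (the original appears to target MySQL), so string literals use single quotes and the column labels are plain identifiers; the function name kms_per_driver is hypothetical.

```python
import sqlite3

RIDES = [
    (1, 1, 3, 4.45, "2016-05-16"), (2, 1, 4, 8.46, "2016-05-16"),
    (3, 2, 6, 11.9, "2016-05-16"), (4, 3, 3, 6.76, "2016-05-16"),
    (5, 2, 6, 13.55, "2016-05-16"), (6, 4, 3, 4.91, "2016-05-20"),
    (7, 1, 7, 16.77, "2016-05-20"), (8, 3, 9, 16.18, "2016-05-20"),
    (9, 2, 3, 6.07, "2016-05-20"), (10, 4, 4, 6.25, "2016-05-20"),
]

def kms_per_driver(conn):
    """One row per driver, one kilometre total per ride date."""
    sql = """
        SELECT driver_id,
               SUM(CASE WHEN ride_date = '2016-05-16'
                        THEN ride_in_kms ELSE 0 END) AS kms_may_16,
               SUM(CASE WHEN ride_date = '2016-05-20'
                        THEN ride_in_kms ELSE 0 END) AS kms_may_20
        FROM rides
        GROUP BY driver_id
        ORDER BY driver_id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE rides (
    ride_id INT, driver_id INT, ride_in_kms INT,
    ride_fare REAL, ride_date TEXT)""")
conn.executemany("INSERT INTO rides VALUES (?, ?, ?, ?, ?)", RIDES)

totals = kms_per_driver(conn)
for row in totals:
    print(row)
# (1, 7, 7)
# (2, 12, 3)
# (3, 3, 9)
# (4, 0, 7)
```

Driver 4 has no rides on 2016-05-16, so the ELSE 0 branch is what makes that cell 0 rather than NULL.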

SQL search full time or partial time

I am trying to create a stored procedure that can filter on a full or partial time. It should only filter on hours or minutes (not seconds) or both hours and minutes.
Using the sample data below:
@StartTimeFilter = '09:15' --> should return record #1
@StartTimeFilter = '10' --> should return records 2, 3, 10
@StartTimeFilter = '5' --> should return records 1, 2, 5, 6, 7, 8, 9
@StartTimeFilter = '45' --> should return records 5, 6
@StartTimeFilter = '13:45' --> should return record #6
@StartTimeFilter = '11:' --> should return records 4, 5
Code:
CREATE TABLE test
(
id INT,
startTime DateTime
);
INSERT INTO test (id, startTime) VALUES (1, '2021-10-25 09:15:00');
INSERT INTO test (id, startTime) VALUES (2, '2021-10-25 10:15:00');
INSERT INTO test (id, startTime) VALUES (3, '2021-10-25 10:30:00');
INSERT INTO test (id, startTime) VALUES (4, '2021-10-25 11:30:00');
INSERT INTO test (id, startTime) VALUES (5, '2021-10-25 11:45:00');
INSERT INTO test (id, startTime) VALUES (6, '2021-10-25 13:45:00');
INSERT INTO test (id, startTime) VALUES (7, '2021-10-25 14:50:00');
INSERT INTO test (id, startTime) VALUES (8, '2021-10-25 15:51:00');
INSERT INTO test (id, startTime) VALUES (9, '2021-10-25 15:58:00');
INSERT INTO test (id, startTime) VALUES (10,'2021-10-25 16:10:00');
It looks like you need a simple LIKE string comparison:
declare @StartTimeFilter varchar(10) = '5'
select *
from test
where Convert(varchar(5), startTime, 114) like Concat('%', @StartTimeFilter, '%')
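To check the substring approach against the six expected results listed in the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption, so strftime('%H:%M', ...) stands in for the Convert(varchar(5), ..., 114) call; the function name matching_ids is hypothetical.

```python
import sqlite3

TIMES = ["09:15", "10:15", "10:30", "11:30", "11:45",
         "13:45", "14:50", "15:51", "15:58", "16:10"]

def matching_ids(conn, start_time_filter):
    """Return ids whose HH:MM start time contains the filter text."""
    # Note: % and _ inside the filter would act as LIKE wildcards.
    sql = """
        SELECT id
        FROM test
        WHERE strftime('%H:%M', startTime) LIKE '%' || ? || '%'
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (start_time_filter,))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INT, startTime TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(i, "2021-10-25 " + t + ":00") for i, t in enumerate(TIMES, start=1)],
)

for f in ["09:15", "10", "5", "45", "13:45", "11:"]:
    print(repr(f), matching_ids(conn, f))
```

Because the comparison is applied to a computed string, an index on startTime is unlikely to help; on a large table that trade-off would need to be measured.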

Can I improve this query for use in large tables?

How can I improve this query for use in large tables?
I use a table ('DataValues') to store a collection of values ('Value') for each visit ('Visit_id'), i.e. it records certain values for each visit.
I use a table ('MatchItems') to store dynamic match sets ('MatchSet') of values ('Value'); a set can contain any number of values. The table also has an IsNeg field to indicate that the match requires a value to be absent from the visit collection.
This allows me to dynamically match visits that conform to criteria such as:
must contain values A, B and C and NOT D, OR C and B and NOT A,
i.e. (Value = A and Value = B and Value = C and Value /= D)
or (Value = C and Value = B and Value /= A)
I have a query that delivers a reasonable solution (fiddle):
CREATE TABLE DataValues (
id NUMBER(5) CONSTRAINT DataValues_pk PRIMARY KEY,
Visit_id Number(5) ,
Value varchar(5)
);
INSERT INTO DataValues VALUES (1, 1, 'M');
INSERT INTO DataValues VALUES (2, 1, 'I');
INSERT INTO DataValues VALUES (3, 1, 'C');
INSERT INTO DataValues VALUES (4, 1, 'K');
INSERT INTO DataValues VALUES (5, 1, 'E');
INSERT INTO DataValues VALUES (6, 1, 'Y');
INSERT INTO DataValues VALUES (7, 2, 'M');
INSERT INTO DataValues VALUES (8, 2, 'O');
INSERT INTO DataValues VALUES (9, 2, 'U');
INSERT INTO DataValues VALUES (10, 2, 'S');
INSERT INTO DataValues VALUES (11, 2, 'E');
INSERT INTO DataValues VALUES (12, 3, 'C');
INSERT INTO DataValues VALUES (13, 3, 'A');
INSERT INTO DataValues VALUES (14, 3, 'T');
INSERT INTO DataValues VALUES (15, 4, 'S');
INSERT INTO DataValues VALUES (16, 4, 'A');
INSERT INTO DataValues VALUES (17, 4, 'T');
INSERT INTO DataValues VALUES (18, 5, 'M');
INSERT INTO DataValues VALUES (19, 5, 'A');
INSERT INTO DataValues VALUES (20, 5, 'T');
CREATE TABLE MatchItems (
id NUMBER(5) CONSTRAINT MatchItems_pk PRIMARY KEY,
MatchSet Number(5),
Value VARCHAR(5),
IsNeg NUMBER(1) NOT NULL CHECK (IsNeg in (0,1))
);
INSERT INTO MatchItems VALUES (1, 1, 'M', 0);
INSERT INTO MatchItems VALUES (2, 1, 'I', 0);
INSERT INTO MatchItems VALUES (3, 1, 'C', 0);
INSERT INTO MatchItems VALUES (4, 1, 'K', 0);
INSERT INTO MatchItems VALUES (5, 1, 'E', 0);
INSERT INTO MatchItems VALUES (6, 1, 'Y', 0);
INSERT INTO MatchItems VALUES (7, 2, 'C', 0);
INSERT INTO MatchItems VALUES (8, 2, 'A', 0);
INSERT INTO MatchItems VALUES (9, 3, 'A', 0);
INSERT INTO MatchItems VALUES (10, 3, 'T', 0);
INSERT INTO MatchItems VALUES (11, 4, 'S', 1);
INSERT INTO MatchItems VALUES (12, 4, 'A', 0);
INSERT INTO MatchItems VALUES (13, 4, 'K', 1);
INSERT INTO MatchItems VALUES (14, 5, 'A', 0);
INSERT INTO MatchItems VALUES (15, 5, 'T', 0);
SELECT
MatchItems.MatchSet,
DataValues.Visit_id,
GpMatchItems.Count TgtCount,
Count(MatchItems.Id),
sum(MatchItems.IsNeg)
FROM DataValues
LEFT JOIN MatchItems ON MatchItems.Value = DataValues.Value
--AND MatchItems.MatchSet = 4
LEFT JOIN (SELECT
MatchItems.MatchSet,
count(*) Count
FROM MatchItems
WHERE
MatchItems.IsNeg = 0
GROUP BY
MatchItems.MatchSet) GpMatchItems ON GpMatchItems.MatchSet = MatchItems.MatchSet
HAVING
Count(MatchItems.Id) = GpMatchItems.Count
AND sum(MatchItems.IsNeg) = 0
GROUP BY
MatchItems.MatchSet,
DataValues.Visit_id,
GpMatchItems.Count
How can I improve the performance of this query where the DataValues table contains 100m records, and MatchItems may include a collection of 50 sets each of 2 - 20 values?
You can try this version using Analytic functions and see if it performs any better. This query removes the subquery GpMatchItems that you are joining with.
SELECT DISTINCT matchset,
visit_id,
tgtcount,
match_visit_count,
isneg_sum
FROM (SELECT MatchItems.MatchSet,
DataValues.Visit_id,
COUNT (DISTINCT CASE MatchItems.IsNeg WHEN 0 THEN MatchItems.id ELSE NULL END)
OVER (PARTITION BY MatchItems.MatchSet)
AS tgtcount,
COUNT (*) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS match_visit_count,
SUM (MatchItems.IsNeg) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS isneg_sum
FROM DataValues LEFT JOIN MatchItems ON MatchItems.VALUE = DataValues.VALUE)
WHERE tgtcount = match_visit_count AND isneg_sum = 0;
I have adjusted EJ's suggestion to include a LEFT JOIN that collects tgtcount, the total number of positive matches required in each MatchSet:
SELECT DISTINCT matchset,
visit_id,
tgtcount,
match_visit_count,
isneg_sum
FROM (SELECT MatchItems.MatchSet,
DataValues.Visit_id,
GpMatchItems.Count AS tgtcount,
COUNT (*) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS match_visit_count,
SUM (MatchItems.IsNeg) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS isneg_sum
FROM DataValues
LEFT JOIN MatchItems ON MatchItems.VALUE = DataValues.VALUE
LEFT JOIN (SELECT
MatchItems.MatchSet,
count(*) Count
FROM MatchItems
WHERE MatchItems.IsNeg = 0
GROUP BY
MatchItems.MatchSet) GpMatchItems
ON GpMatchItems.MatchSet = MatchItems.MatchSet)
WHERE
tgtcount = match_visit_count
AND isneg_sum = 0;
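For comparison, the same matching rule can be written as a plain GROUP BY with a HAVING clause, which makes it easy to check the expected matches on the sample data. The sketch below is not from the original answers: it uses Python's built-in sqlite3 module instead of Oracle, an inner join instead of the LEFT JOIN, and a hypothetical index name, and it assumes a value appears at most once per visit and once per match set. Whether it performs any better on 100m rows would have to be measured.

```python
import sqlite3

# Sample data from the question: each visit's values spell a word.
VISITS = {1: "MICKEY", 2: "MOUSE", 3: "CAT", 4: "SAT", 5: "MAT"}
MATCH_ITEMS = (
    [(1, ch, 0) for ch in "MICKEY"]
    + [(2, "C", 0), (2, "A", 0), (3, "A", 0), (3, "T", 0)]
    + [(4, "S", 1), (4, "A", 0), (4, "K", 1), (5, "A", 0), (5, "T", 0)]
)

def load(conn):
    conn.executescript("""
        CREATE TABLE DataValues (id INTEGER PRIMARY KEY, Visit_id INT, Value TEXT);
        CREATE TABLE MatchItems (id INTEGER PRIMARY KEY, MatchSet INT,
                                 Value TEXT, IsNeg INT);
        -- Hypothetical index; it may let the join avoid a full scan per value.
        CREATE INDEX ix_datavalues_value ON DataValues (Value, Visit_id);
    """)
    conn.executemany(
        "INSERT INTO DataValues (Visit_id, Value) VALUES (?, ?)",
        [(visit, ch) for visit, word in VISITS.items() for ch in word])
    conn.executemany(
        "INSERT INTO MatchItems (MatchSet, Value, IsNeg) VALUES (?, ?, ?)",
        MATCH_ITEMS)

def find_matches(conn):
    """(MatchSet, Visit_id) pairs with every positive item and no negative one."""
    sql = """
        SELECT m.MatchSet, d.Visit_id
        FROM DataValues d
        JOIN MatchItems m ON m.Value = d.Value
        GROUP BY m.MatchSet, d.Visit_id
        HAVING SUM(m.IsNeg) = 0
           AND COUNT(*) = (SELECT COUNT(*) FROM MatchItems x
                           WHERE x.MatchSet = m.MatchSet AND x.IsNeg = 0)
        ORDER BY m.MatchSet, d.Visit_id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
load(conn)
matches = find_matches(conn)
print(matches)
```

On the sample data, match set 4 (A required, S and K forbidden) matches visits 3 and 5 but not visit 4, because visit 4 contains S.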

Sample observations per group without replacement in SQL

Using the provided table, I would like to sample, say, 2 users per day so that the users assigned to the two days are different. Of course, the problem I have is more sophisticated, but this simple example gives the idea.
drop table if exists test;
create table test (
user_id int,
day_of_week int);
insert into test values (1, 1);
insert into test values (1, 2);
insert into test values (2, 1);
insert into test values (2, 2);
insert into test values (3, 1);
insert into test values (3, 2);
insert into test values (4, 1);
insert into test values (4, 2);
insert into test values (5, 1);
insert into test values (5, 2);
insert into test values (6, 1);
insert into test values (6, 2);
The expected results would look like this:
create table results (
user_id int,
day_of_week int);
insert into results values (1, 1);
insert into results values (2, 1);
insert into results values (3, 2);
insert into results values (6, 2);
You can use window functions. Here is an example, although the details depend on your database (functions for random numbers vary by database):
select t.*
from (select t.*, row_number() over (partition by day_of_week order by random()) as seqnum
from test t
) t
where seqnum <= 2;
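Note that the query above draws its sample for each day independently, so the same user can be picked for both days. A minimal sketch of one way to keep the days disjoint is to shuffle the distinct users once and hand them out to the days in consecutive blocks. This is a different technique from the answer above, shown with Python's built-in sqlite3 module (window functions need SQLite 3.25 or later); the names SAMPLE_SQL and sample_users are hypothetical, and it assumes there are enough users to fill every day.

```python
import sqlite3

SAMPLE_SQL = """
WITH shuffled AS (            -- every user gets one random position
    SELECT user_id, ROW_NUMBER() OVER (ORDER BY random()) AS rn
    FROM (SELECT DISTINCT user_id FROM test)
),
days AS (                     -- days numbered 1, 2, ...
    SELECT day_of_week, ROW_NUMBER() OVER (ORDER BY day_of_week) AS dn
    FROM (SELECT DISTINCT day_of_week FROM test)
)
SELECT s.user_id, d.day_of_week
FROM shuffled s
JOIN days d ON (s.rn - 1) / :k + 1 = d.dn      -- integer division: block of k users per day
JOIN test t ON t.user_id = s.user_id           -- keep only pairs that exist in the data
           AND t.day_of_week = d.day_of_week
ORDER BY d.day_of_week, s.user_id
"""

def sample_users(conn, k):
    """Pick k users per day such that no user is used for two days."""
    return conn.execute(SAMPLE_SQL, {"k": k}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (user_id INT, day_of_week INT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(user, day) for user in range(1, 7) for day in (1, 2)],
)

picked = sample_users(conn, 2)
print(picked)  # four rows; the users differ from run to run
```

If a shuffled user has no row for the day it is assigned to, the final join drops it, so that day would end up with fewer than k users; the sample data avoids this because every user appears on both days.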

How to insert multiple row values into SQL

I have the following 3 tables:
CREATE TABLE Tests (
Test_ID INT,
TestName VARCHAR(50));
INSERT INTO Tests VALUES (1, 'SQL Test');
INSERT INTO Tests VALUES (2, 'C# Test');
INSERT INTO Tests VALUES (3, 'Java Test');
CREATE TABLE Users (
[User_ID] INT,
UserName VARCHAR(50));
INSERT INTO Users VALUES (1, 'Joe');
INSERT INTO Users VALUES (2, 'Jack');
INSERT INTO Users VALUES (3, 'Jane');
CREATE TABLE UserTests (
ID INT,
[User_ID] INT,
Test_ID INT,
Completed INT);
INSERT INTO UserTests VALUES (1, 1, 1, 0);
INSERT INTO UserTests VALUES (2, 1, 2, 1);
INSERT INTO UserTests VALUES (3, 1, 3, 1);
INSERT INTO UserTests VALUES (4, 2, 1, 0);
INSERT INTO UserTests VALUES (5, 2, 2, 0);
INSERT INTO UserTests VALUES (6, 2, 3, 0);
INSERT INTO UserTests VALUES (7, 3, 1, 1);
INSERT INTO UserTests VALUES (8, 3, 2, 1);
INSERT INTO UserTests VALUES (9, 3, 3, 1);
I would like to create some rule/trigger so that when a new user gets added to the Users table, an entry for each Test and that user's Id will get added to the UserTests table.
Something like this if the new user ID is 5:
INSERT dbo.UserTest
(USER_ID, TEST_ID, Completed)
VALUES
(5, SELECT TEST_ID FROM Tests, 0)
That syntax is of course wrong, but it gives an idea of what I expect to happen.
So I expect that statement to add these values to the UserTests table:
User ID| Test ID| Completed
5 | 1 | 0
5 | 2 | 0
5 | 3 | 0
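You can use an AFTER INSERT trigger on the Users table:
CREATE TRIGGER tr_user ON Users
AFTER INSERT
AS
BEGIN
-- Inserted holds the new user rows; pair each one with every test
INSERT INTO UserTests ([User_ID], Test_ID, Completed)
SELECT i.[User_ID], t.Test_ID, 0
FROM Inserted i
CROSS JOIN Tests t;
END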
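To try the trigger idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The dialect differs from the trigger above: SQLite triggers fire once per inserted row and expose it as NEW instead of the Inserted table. The user name 'Jill' is a made-up value for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tests (Test_ID INTEGER, TestName TEXT);
CREATE TABLE Users (User_ID INTEGER, UserName TEXT);
CREATE TABLE UserTests (ID INTEGER PRIMARY KEY, User_ID INTEGER,
                        Test_ID INTEGER, Completed INTEGER);
INSERT INTO Tests VALUES (1, 'SQL Test'), (2, 'C# Test'), (3, 'Java Test');

-- For each new user, add one not-yet-completed row per existing test.
CREATE TRIGGER tr_user AFTER INSERT ON Users
BEGIN
    INSERT INTO UserTests (User_ID, Test_ID, Completed)
    SELECT NEW.User_ID, Test_ID, 0 FROM Tests;
END;
""")

# 'Jill' is a hypothetical user; the id 5 follows the question's example.
conn.execute("INSERT INTO Users VALUES (5, 'Jill')")

user_tests = conn.execute(
    "SELECT User_ID, Test_ID, Completed FROM UserTests ORDER BY Test_ID"
).fetchall()
print(user_tests)  # [(5, 1, 0), (5, 2, 0), (5, 3, 0)]
```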
Here's a SQL Fiddle that finds missing records and inserts them.
The SELECT:
select u.user_id, t.test_id, 0 as Completed
from users u
cross join tests t
where not exists (
select 1
from usertests ut
where ut.user_id = u.user_id and ut.test_id = t.test_id)
Adding insert into UserTests (User_Id, Test_Id, Completed) before the select will insert these records.
You can add a user id to the WHERE clause to do it for a single user if required. The statement is re-runnable: it won't re-insert test ids for a user that already has them, but it will add new ones if new tests are introduced.
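The re-runnable behaviour can be checked with a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption, and the function name backfill and the fourth test row are hypothetical. The sketch starts with an empty UserTests table, runs the insert twice, then adds a test and runs it again.

```python
import sqlite3

BACKFILL_SQL = """
INSERT INTO UserTests (User_ID, Test_ID, Completed)
SELECT u.User_ID, t.Test_ID, 0
FROM Users u
CROSS JOIN Tests t
WHERE NOT EXISTS (
    SELECT 1
    FROM UserTests ut
    WHERE ut.User_ID = u.User_ID AND ut.Test_ID = t.Test_ID)
"""

def backfill(conn):
    """Insert the missing (user, test) pairs; return how many rows were added."""
    before = conn.execute("SELECT COUNT(*) FROM UserTests").fetchone()[0]
    conn.execute(BACKFILL_SQL)
    after = conn.execute("SELECT COUNT(*) FROM UserTests").fetchone()[0]
    return after - before

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tests (Test_ID INTEGER, TestName TEXT);
CREATE TABLE Users (User_ID INTEGER, UserName TEXT);
CREATE TABLE UserTests (ID INTEGER PRIMARY KEY, User_ID INTEGER,
                        Test_ID INTEGER, Completed INTEGER);
INSERT INTO Tests VALUES (1, 'SQL Test'), (2, 'C# Test'), (3, 'Java Test');
INSERT INTO Users VALUES (1, 'Joe'), (2, 'Jack'), (3, 'Jane');
""")

first = backfill(conn)    # all 3 x 3 pairs are missing at the start
second = backfill(conn)   # nothing left to add
conn.execute("INSERT INTO Tests VALUES (4, 'Hypothetical Test')")
third = backfill(conn)    # one new row per existing user
print(first, second, third)  # 9 0 3
```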