SQL: Insert multiple values for one staff key from two tables (try to improve)

Basically, I want to insert multiple values for a single staff member from two tables, Value_table and Staff_table, which look like this:
Staff_table
| staff_num | staff_name | staff_role |
+-----------+------------+------------+
| 1         | Bill       | 1          |
| 2         | James      | 1          |
| 3         | Gina       | 2          |
| 4         | Tim        | 3          |
Value_table
| value | value_name | value_state |
+-------+------------+-------------+
| 123   | Food       | 0           |
| 476   | Drink      | 1           |
| 656   | Dinner     | 1           |
| 77    | Phone      | 1           |
And the result should look like this:
| staff_num | value |
+-----------+-------+
| 1         | 123   |
| 1         | 476   |
| 1         | 656   |
| 1         | 77    |
| 2         | 123   |
| 2         | 476   |
| 2         | 656   |
| 2         | 77    |
| ...       | ...   |
I found a SQL way to do it, which is the following code:
INSERT INTO Result_table
SELECT DISTINCT a.staff_num, c.value
FROM Staff_table a
JOIN Value_table c ON c.value BETWEEN 0 AND 656;
It works, but I don't think the last line
JOIN Value_table c ON c.value BETWEEN 0 AND 656;
is a good way to select all the values from the table.
Is there a better way to do it in SQL?

Alternatively, you may use a CROSS JOIN:
INSERT INTO Result_table
SELECT DISTINCT a.staff_num, c.value
FROM Staff_table a
CROSS JOIN Value_table c
There is no need to restrict the value column, since it already falls between 0 and 656 in the sample data.
Edit: If you want to restrict the values, add a WHERE condition as below:
INSERT INTO Result_table
SELECT DISTINCT a.staff_num, c.value
FROM Staff_table a
CROSS JOIN Value_table c
WHERE c.value BETWEEN 77 AND 123
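With the sample data above, the cross join pairs every staff row with every value row, so the insert produces 4 x 4 = 16 rows. As a small aside, if Result_table has exactly the two columns staff_num and value (an assumption based on the desired output), listing the target columns explicitly keeps the insert working even if the table later gains more columns:
INSERT INTO Result_table (staff_num, value)   -- assumed target columns
SELECT DISTINCT a.staff_num, c.value
FROM Staff_table a
CROSS JOIN Value_table c;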

Related

Replace nulls of a column with column value from another table

I have data flowing from two tables, Table A and Table B. I'm doing an inner join on a common column from both tables and creating two new columns based on different conditions. Below is a sample dataset:
Table A
| Id | StartDate |
|-----|------------|
| 119 | 01-01-2018 |
| 120 | 01-02-2019 |
| 121 | 03-05-2018 |
| 123 | 05-08-2021 |
TABLE B
| Id | CodeId | Code | RedemptionDate |
|-----|--------|------|----------------|
| 119 | 1 | abc | null |
| 119 | 2 | abc | null |
| 119 | 3 | def | null |
| 119 | 4 | def | 2/3/2019 |
| 120 | 5 | ghi | 04/7/2018 |
| 120 | 6 | ghi | 4/5/2018 |
| 121 | 7 | jkl | null |
| 121 | 8 | jkl | 4/4/2019 |
| 121 | 9 | mno | 3/18/2020 |
| 123 | 10 | pqr | null |
What I'm basically doing is joining the tables on column 'Id' when the StartDate year is 2018 or later, and creating two new columns: 'Unlock' by counting CodeId when RedemptionDate is null, and 'Redeem' by counting CodeId when RedemptionDate is not null. Below is the SQL query:
WITH cte1 AS (
    SELECT a.Id, COUNT(b.CodeId) AS 'Unlock'
    FROM TableA AS a
    JOIN TableB AS b ON a.Id = b.Id
    WHERE YEAR(a.StartDate) >= 2018 AND b.RedemptionDate IS NULL
    GROUP BY a.Id
), cte2 AS (
    SELECT a.Id, COUNT(b.CodeId) AS 'Redeem'
    FROM TableA AS a
    JOIN TableB AS b ON a.Id = b.Id
    WHERE YEAR(a.StartDate) >= 2018 AND b.RedemptionDate IS NOT NULL
    GROUP BY a.Id
)
SELECT cte1.Id, cte1.Unlock, cte2.Redeem
FROM cte1
FULL OUTER JOIN cte2 ON cte1.Id = cte2.Id
If I break down the output of this query, result from cte1 will look like below:
| Id | Unlock |
|-----|--------|
| 119 | 3 |
| 121 | 1 |
| 123 | 1 |
And from cte2 will look like below:
| Id | Redeem |
|-----|--------|
| 119 | 1 |
| 120 | 2 |
| 121 | 2 |
The last select query will produce the following result:
| Id | Unlock | Redeem |
|------|--------|--------|
| 119 | 3 | 1 |
| null | null | 2 |
| 121 | 1 | 2 |
| 123 | 1 | null |
How can I replace the null values in Id with the values from 'b.Id'? If I try COALESCE or a CASE statement, they create new columns. I don't want to create additional columns; rather, I want to replace the null values with the column values coming from the other table.
My final output should look like:
| Id | Unlock | Redeem |
|-----|--------|--------|
| 119 | 3 | 1 |
| 120 | null | 2 |
| 121 | 1 | 2 |
| 123 | 1 | null |
If I'm following correctly, you can use apply with aggregation:
select a.*, b.*
from TableA a cross apply
     (select count(tb.RedemptionDate) as num_redeemed,
             count(*) - count(tb.RedemptionDate) as num_unlock
      from TableB tb
      where tb.Id = a.Id
     ) b;
However, the answer to your question is to use coalesce(cte1.id, cte2.id) as id.
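Applied to the query above, that final SELECT would become something like this (cte1 and cte2 stay exactly as defined in the question):
-- Only the final SELECT changes; COALESCE takes the Id from whichever side is not null.
SELECT COALESCE(cte1.Id, cte2.Id) AS Id,
       cte1.Unlock,
       cte2.Redeem
FROM cte1
FULL OUTER JOIN cte2 ON cte1.Id = cte2.Id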

Calculate a column value backwards over a series of previous rows / RECURSIVE / CONNECT BY

I need your help. I guess/hope there is a function for that. I found "CONNECT BY" and "WITH RECURSIVE ... AS" but neither seems to solve my problem.
GIVEN TABLES:
Table A
+----+---------+----------+
| id | prev_id | date     |
+----+---------+----------+
| 1  |         | 20200101 |
| 23 | 1       | 20200104 |
| 34 | 23      | 20200112 |
| 41 | 34      | 20200130 |
+----+---------+----------+
Table B
+--------+-----+
| ref_id | key |
+--------+-----+
| 41     | abc |
+--------+-----+
(It always points to the latest entry in table "A"; it is updated in place, with no history.)
Join Statement:
SELECT
id, prev_id, key, date
FROM A
LEFT OUTER JOIN B ON B.ref_id = A.id
GIVEN psql result set:
+----+---------+-----+----------+
| id | prev_id | key | date     |
+----+---------+-----+----------+
| 1  |         |     | 20200101 |
| 23 | 1       |     | 20200104 |
| 34 | 23      |     | 20200112 |
| 41 | 34      | abc | 20200130 |
+----+---------+-----+----------+
DESIRED output:
+----+---------+-----+----------+
| id | prev_id | key | date     |
+----+---------+-----+----------+
| 1  |         | abc | 20200101 |
| 23 | 1       | abc | 20200104 |
| 34 | 23      | abc | 20200112 |
| 41 | 34      | abc | 20200130 |
+----+---------+-----+----------+
The rows of the result set are connected by columns 'id' and 'prev_id'.
I want to calculate the "key" column in a reasonable time.
Keep in mind that this is a very simplified example. Normally there are many more rows and different keys and ids.
I understand that you want to propagate the key from tableb up the hierarchy of each row. Here is one approach using a recursive query:
with recursive cte as (
    select a.id, a.prev_id, a.date, b.key
    from tablea a
    inner join tableb b on b.ref_id = a.id
    union all
    select a.id, a.prev_id, a.date, c.key
    from cte c
    inner join tablea a on a.id = c.prev_id
)
select * from cte
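For anyone who wants to try it, here is a minimal, self-contained PostgreSQL sketch reproducing the sample data from the question (table and column names as in the answer; the data is just the hypothetical sample):
-- Reproduce the sample data from the question.
create temporary table tablea (id int, prev_id int, date int);
create temporary table tableb (ref_id int, key text);

insert into tablea values
    (1,  null, 20200101),
    (23, 1,    20200104),
    (34, 23,   20200112),
    (41, 34,   20200130);

insert into tableb values (41, 'abc');

-- Start from the row referenced in tableb and walk backwards via prev_id,
-- carrying the key along the way.
with recursive cte as (
    select a.id, a.prev_id, a.date, b.key
    from tablea a
    inner join tableb b on b.ref_id = a.id
    union all
    select a.id, a.prev_id, a.date, c.key
    from cte c
    inner join tablea a on a.id = c.prev_id
)
select * from cte order by date;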

How to fill forward time series data in Postgres

I am looking to join three tables together and fill forward null values on the resulting table.
Three tables:
Table 1 (raw.fb_historical_data) - this is the main table onto which I would like to join the other two. Each row of this table is related to one or more rows in the other two tables through a combination of the columns id, clk and timestamp (mkt_id and row_id in the other tables).
+---------------------+-----+-----+--------------+
| timestamp | clk | id | some_columns |
+---------------------+-----+-----+--------------+
| 2016-06-19 06:11:13 | 123 | 126 | a |
| 2016-06-19 06:16:13 | 124 | 127 | b |
| 2016-06-19 06:21:13 | 234 | 126 | c |
| 2016-06-19 06:41:13 | 456 | 127 | d |
| ... | ... | ... | ... |
+---------------------+-----+-----+--------------+
Table 2 (raw.fb_runner_changes) - this table essentially gives price changes for a wide range of different markets
+---------------------+--------+--------+-------+
| timestamp | row_id | mkt_id | price |
+---------------------+--------+--------+-------+
| 2016-06-19 06:11:13 | 123 | 126 | 1 |
| 2016-06-19 06:21:13 | 123 | 126 | 2 |
| 2016-06-19 06:41:13 | 123 | 126 | 3 |
| 2016-06-06 18:54:06 | 124 | 127 | 1 |
| 2016-06-06 18:56:06 | 124 | 127 | 2 |
| 2016-06-06 18:57:06 | 124 | 127 | 3 |
| ... | ... | ... | ... |
+---------------------+--------+--------+-------+
Table 3 (raw.fb_runners) - a table with extra information about market changes that I would like to join
+---------------------+--------+--------+---------------+
| timestamp | row_id | mkt_id | other_columns |
+---------------------+--------+--------+---------------+
| 2016-06-19 06:15:13 | 234 | 126 | ab |
| 2016-06-19 06:31:13 | 234 | 126 | cd |
| 2016-06-19 06:56:13 | 234 | 126 | ef |
| 2016-06-06 18:54:06 | 456 | 127 | gh |
| 2016-06-06 18:56:06 | 456 | 127 | jk |
| 2016-06-06 18:57:06 | 456 | 127 | lm |
| ... | ... | ... | ... |
+---------------------+--------+--------+---------------+
Essentially what I want to do is fill NULL information forward (ordered by timestamp) while grouping by market id.
So far, I have tried to join the tables together using
SELECT *
FROM raw.fb_historical_data AS h
LEFT JOIN raw.fb_runner_changes AS rc
ON rc.row_id = h.clk
AND rc.timestamp = h.timestamp
AND rc.mkt_id = h.id
LEFT JOIN raw.fb_runners AS r
ON r.row_id = h.clk
AND r.timestamp = h.timestamp
AND r.mkt_id = h.id
This has worked as intended, though now there are nulls in the resulting dataset which I'd like to fill in with the last available value for that market.
With some of the other SQL dialects, fill forward could be done using the window function last_value in combination with the instruction ignore nulls.
Since this is not supported in PostgreSQL (check the note at the bottom of this page), we use a two-step workaround.
select ts, val, val_seq,
       min(val) over (partition by val_seq) as val_fill_fw
from (select ts, val,
             count(val) over (order by ts) as val_seq
      from t
     ) t
+----+----------+---------+-------------+
| ts | val | val_seq | val_fill_fw |
+----+----------+---------+-------------+
| 1 | (null) | 0 | (null) |
| 2 | (null) | 0 | (null) |
| 3 | hello | 1 | hello |
| 4 | (null) | 1 | hello |
| 5 | (null) | 1 | hello |
| 6 | darkness | 2 | darkness |
| 7 | my | 3 | my |
| 8 | (null) | 3 | my |
| 9 | old | 4 | old |
| 10 | (null) | 4 | old |
| 11 | (null) | 4 | old |
| 12 | (null) | 4 | old |
| 13 | friend | 5 | friend |
| 14 | (null) | 5 | friend |
+----+----------+---------+-------------+
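Adapting the same idea to your data, a rough sketch (only the price column from raw.fb_runner_changes is shown for brevity, and the join result is wrapped in a hypothetical CTE named joined) fills forward per market like this:
-- "joined" stands in for the three-way join from the question, reduced to
-- the market id and the price column that needs filling.
with joined as (
    select h.timestamp, h.id, rc.price
    from raw.fb_historical_data h
    left join raw.fb_runner_changes rc
           on rc.row_id = h.clk
          and rc.timestamp = h.timestamp
          and rc.mkt_id = h.id
)
select timestamp, id, price,
       -- each price_seq group starts at a non-null price, so min() picks it up
       min(price) over (partition by id, price_seq) as price_filled
from (select timestamp, id, price,
             count(price) over (partition by id order by timestamp) as price_seq
      from joined
     ) t;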
This seems to correctly do a 'forward fill' in Postgres. However, I am a Postgres newbie, so I would appreciate feedback if it's wrong.
DROP TABLE IF EXISTS example;
create temporary table example(id int, str text, val integer);
insert into example values
    (1, 'a', null),
    (1, null, 1),
    (2, 'b', 2),
    (2, null, null);
select * from example;

-- Note: lag(col, 1) only looks one row back, so this fills a single
-- consecutive null; longer gaps need the grouping approach shown above.
select id,
       (case
            when str is null then lag(str, 1) over (order by id)
            else str
        end) as str,
       (case
            when val is null then lag(val, 1) over (order by id)
            else val
        end) as val
from example;

SQL - aggregation with column value as column name

For a table like the one below, I need to do an aggregation such that, for each unique value in one column, I find the count of occurrences of each discrete value in another column.
input table is:
id | model | datetime   | driver | distance
---|-------|------------|--------|---------
1  | S     | 04/03/2009 | john   | 399
2  | X     | 04/03/2009 | juliet | 244
3  | 3     | 04/03/2009 | borat  | 555
4  | 3     | 03/03/2009 | john   | 300
5  | X     | 03/03/2009 | juliet | 200
6  | X     | 03/03/2009 | borat  | 500
7  | S     | 24/12/2008 | borat  | 600
8  | X     | 01/01/2009 | borat  | 700
Output required
model | john | juliet | borat
------|------|--------|------
S     | 1    | 0      | 1
X     | 0    | 2      | 2
3     | 1    | 0      | 1
One potential way to do this is to group by model with an aggregation like SUM(CASE WHEN driver = 'value' THEN 1 ELSE 0 END) AS value for each discrete value of the driver column. But the challenge is that sometimes the number of discrete values is too large (around 50 in my case), or in some cases I do not even know all the possible discrete values, so I was wondering if there is an alternative way to do this.
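For reference, the conditional aggregation I have in mind would look something like this (hardcoding the three drivers from the sample data; Table1 is just a placeholder table name):
-- Works only when every distinct driver value is known in advance.
SELECT model,
       SUM(CASE WHEN driver = 'john'   THEN 1 ELSE 0 END) AS john,
       SUM(CASE WHEN driver = 'juliet' THEN 1 ELSE 0 END) AS juliet,
       SUM(CASE WHEN driver = 'borat'  THEN 1 ELSE 0 END) AS borat
FROM Table1
GROUP BY model;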
The aggregation part needs a little more work.
Here are the details:
First, calculate all the possible combinations.
Then use a LEFT JOIN to find which combinations don't have data.
WITH "allDrivers" as (
SELECT DISTINCT "driver"
FROM Table1
),
"allModels" as (
SELECT DISTINCT "model"
FROM Table1
),
"source" as (
SELECT d."driver", m."model"
FROM "allDrivers" d
CROSS JOIN "allModels" m
)
SELECT s."model", s."driver", COUNT(t."datetime")
FROM "source" s
LEFT JOIN table1 t
ON s."model" = t."model"
AND s."driver" = t."driver"
GROUP BY s."model", s."driver"
OUTPUT
| model | driver | count |
|-------|--------|-------|
| 3 | borat | 1 |
| 3 | john | 1 |
| 3 | juliet | 0 |
| S | borat | 1 |
| S | john | 1 |
| S | juliet | 0 |
| X | borat | 2 |
| X | john | 0 |
| X | juliet | 2 |
Then you can do the dynamic pivot on this result.
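The pivot step itself is not spelled out above; as a rough sketch (hardcoding the sample drivers, since a genuinely dynamic column list needs dynamic SQL, e.g. PostgreSQL's crosstab from the tablefunc extension or a generated statement), it can be layered on top of the previous query like this:
-- "combos" is the per-combination count query from above; the driver
-- columns are hardcoded here, so an unknown set of drivers would require
-- building this statement dynamically.
WITH "combos" AS (
    SELECT s."model", s."driver", COUNT(t."datetime") AS "cnt"
    FROM (SELECT DISTINCT a."driver", b."model"
          FROM Table1 a
          CROSS JOIN Table1 b) s
    LEFT JOIN Table1 t
           ON s."model" = t."model"
          AND s."driver" = t."driver"
    GROUP BY s."model", s."driver"
)
SELECT "model",
       SUM(CASE WHEN "driver" = 'john'   THEN "cnt" ELSE 0 END) AS "john",
       SUM(CASE WHEN "driver" = 'juliet' THEN "cnt" ELSE 0 END) AS "juliet",
       SUM(CASE WHEN "driver" = 'borat'  THEN "cnt" ELSE 0 END) AS "borat"
FROM "combos"
GROUP BY "model";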

MS Access SQL query from 3 tables

I have 3 tables shown below in MS Access 2010:
Table: devices
id | device_id | Company | Version | Revision |
-----------------------------------------------
1 | dev_a | Almaras | 1.5.1 | 0.2A |
2 | dev_b | Enigma | 1.5.1 | 0.2A |
3 | dev_c | Almaras | 1.5.1 | 0.2C |
*Field: device_id is Primary Key Unique String
*Field ID is just an auto-number column
Table: activities
id | act_id | act_date | act_type | act_note |
------------------------------------------------
1 | dev_a | 07/22/2013 | usb_axc | ok |
2 | dev_a | 07/23/2013 | usb_axe | ok | (LAST ROW for dev_a)
3 | dev_c | 07/22/2013 | usb_axc | ok | (LAST ROW for dev_c)
4 | dev_b | 07/21/2013 | usb_axc | ok | (LAST ROW for dev_b)
*Field: act_id contains device_id; NOT UNIQUE
*Field ID is just an auto-number column
Table: matrix
id | mat_id | tc | ts | bat | cycles |
-----------------------------------------
1 | dev_a | 2811 | 10 | 99 | 200 |
2 | dev_a | 2911 | 10 | 97 | 400 |
3 | dev_a | 3007 | 10 | 94 | 600 |
| 4 | dev_a | 3210 | 10 | 92 | 800 | (LAST ROW for dev_a)
5 | dev_b | 1100 | 5 | 98 | 100 |
6 | dev_b | 1300 | 8 | 93 | 200 |
7 | dev_b | 1411 | 11 | 90 | 300 | (LAST ROW for dev_b)
8 | dev_c | 4000 | 27 | 77 | 478 | (LAST ROW for dev_c)
*Field: mat_id contains device_id; NOT UNIQUE
*Field ID is just an auto-number column
Is there any way to query the tables to get results as shown below (each device from devices, and only the last row added [see example output table] from each of the other two tables)?
Query Results:
device_id | Company | act_date | act_type | bat | cycles |
------------------------------------------------------------
| dev_a | Almaras | 07/23/2013 | usb_axe | 92 | 800 |
| dev_b | Enigma | 07/21/2013 | usb_axc | 90 | 300 |
| dev_c | Almaras | 07/22/2013 | usb_axc | 77 | 478 |
Any ideas? Thank you in advance for reading and helping me out :)
I think this is what you want:
SELECT a.device_id, a.Company,
       b.act_date, b.act_type,
       c.bat, c.cycles
FROM ((((devices AS a
    INNER JOIN activities AS b
        ON a.device_id = b.act_id)
    INNER JOIN matrix AS c
        ON a.device_id = c.mat_id)
    INNER JOIN
        (SELECT act_id, MAX(act_date) AS max_date
         FROM activities
         GROUP BY act_id) AS d
        ON b.act_id = d.act_id AND b.act_date = d.max_date)
    INNER JOIN
        (SELECT mat_id, MAX(tc) AS max_tc
         FROM matrix
         GROUP BY mat_id) AS e
        ON c.mat_id = e.mat_id AND c.tc = e.max_tc)
The subqueries d and e separately get the latest row for every act_id and mat_id, respectively.
Try
SELECT devices.device_id, devices.Company, activities.act_date, activities.act_type, matrix.bat, matrix.cycles
FROM (devices
LEFT JOIN activities
       ON devices.device_id = activities.act_id)
LEFT JOIN matrix
       ON devices.device_id = matrix.mat_id;
What do you consider the "last" row in Matrix?
You need to do something like
WHERE act_date IN (SELECT MAX(a.act_date) FROM activities a WHERE a.act_id = devices.device_id GROUP BY a.act_id)
and something similar for the join to matrix.
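Putting the pieces together, a rough Access-flavoured sketch (untested, and assuming the "last" row in matrix means the row with the highest cycles value for each device) might look like:
SELECT d.device_id, d.Company, a.act_date, a.act_type, m.bat, m.cycles
FROM (devices AS d
LEFT JOIN activities AS a
       ON d.device_id = a.act_id)
LEFT JOIN matrix AS m
       ON d.device_id = m.mat_id
WHERE a.act_date = (SELECT MAX(a2.act_date)
                    FROM activities AS a2
                    WHERE a2.act_id = d.device_id)
  AND m.cycles = (SELECT MAX(m2.cycles)
                  FROM matrix AS m2
                  WHERE m2.mat_id = d.device_id);
Note that the WHERE conditions effectively turn the outer joins into inner joins for any device that has no rows in activities or matrix; with the sample data every device has rows in both, so that does not matter here.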