Postgres query all results from one table blended with conditional data from another table

I have 2 SQL tables and I'm trying to generate a new table with data from the 2 tables.
Jobs table:
jobs (
id SERIAL PRIMARY KEY,
position TEXT NOT NULL,
location TEXT NOT NULL,
pay NUMERIC NOT NULL,
duration TEXT NOT NULL,
description TEXT NOT NULL,
term TEXT NOT NULL,
user_id INTEGER REFERENCES users(id) ON DELETE SET NULL
)
Applied table:
applied (
id SERIAL PRIMARY KEY,
completed BOOLEAN DEFAULT FALSE,
user_id INTEGER REFERENCES users(id) ON DELETE SET NULL,
job_id INTEGER REFERENCES jobs(id) ON DELETE SET NULL,
UNIQUE (user_id, job_id)
);
The tables populated with data look like this:
Jobs table
Applied table
I want my final query to be a table that matches the jobs table but that has a new column called js_id with true or false based on whether the user has applied to that job. I want the table to look like this:
Here is the query I came up with to generate the above table:
SELECT DISTINCT on (jobs.id)
jobs.*, applied.user_id as applicant,
CASE WHEN applied.user_id = 1 THEN TRUE
ELSE FALSE END as js_id
FROM jobs
JOIN applied on jobs.id = applied.job_id;
However, this doesn't work once I add more applicants to the table: I get inconsistent true and false values and haven't been able to get it working. When I remove DISTINCT ON (jobs.id) my true values are consistent, but I wind up with a lot more rows than the 3 jobs I want. Here are the results without the DISTINCT ON (jobs.id):

I think you want exists:
SELECT j.*,
       exists (select 1
               from applied a
               where a.job_id = j.id and a.user_id = 1
              ) as js_id
FROM jobs j;
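If you prefer a join-based form, here is an equivalent sketch (still hard-coding user_id = 1, as in the question) that uses a LEFT JOIN and checks whether a row matched; thanks to the UNIQUE (user_id, job_id) constraint there is at most one match per job, so no DISTINCT ON is needed:
SELECT j.*,
       (a.user_id IS NOT NULL) AS js_id
FROM jobs j
LEFT JOIN applied a ON a.job_id = j.id AND a.user_id = 1;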

Related

Insert into table1 using data from staging_table1 and table2, while using staging_table1 to get the data from table2

Goal: Insert all data into a table from a staging table. Each piece of data in the staging table has two names which can be found in a separate table. Using those two names, I want to find their respective IDs and insert them into the foreign keys of the main table.
Question: How do I insert the data from a staging table into a table while using data from the staging table to query IDs from a separate table?
Example tables:
TABLE location:
id int PRIMARY KEY,
location varchar(255) NOT NULL,
person_oneID int FOREIGN KEY REFERENCES people(person_id),
person_twoID int FOREIGN KEY REFERENCES people(person_id)
TABLE staging_location:
id int PRIMARY KEY,
location varchar(255) NOT NULL,
p1_full_name varchar(255) NOT NULL,
p2_full_name varchar(255) NOT NULL
TABLE people:
person_id int PRIMARY KEY,
first_name varchar(255) NOT NULL,
last_name varchar(255) NOT NULL,
full_name varchar(255) NOT NULL,
This question was the closest example to what I have been looking for, though I haven't been able to get the query to work. Here is what I've tried:
INSERT INTO location(id,location,person_oneID,person_twoID)
SELECT (l.id,l.location,p1.person_oneID,p2.person_twoID)
FROM staging_location AS l
INNER JOIN people p1 ON p1.full_name = l.p1_full_name
INNER JOIN people p2 ON p2.full_name = l.p2_full_name
Additional info: I would like to do this in the same insert statement without using an update because of the number of locations being inserted. I'm using staging tables as a result of importing data from csv files. The csv file with people didn't have an ID field, so I created one for each person by following steps similar to the first answer from this question. Please let me know if any additional information is required or if I can find the answer to my question somewhere I haven't seen.
Use this code, though I do not know your exact data structure, so duplicate rows may be inserted if a full name appears more than once:
INSERT INTO location(id, location, person_oneID, person_twoID)
SELECT l.id, l.location, p1.person_id AS person_oneID, p2.person_id AS person_twoID
FROM staging_location AS l
INNER JOIN people p1 ON p1.full_name = l.p1_full_name
INNER JOIN people p2 ON p2.full_name = l.p2_full_name;
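Before running the insert, a quick sanity check such as the following (a sketch against the people table above) shows whether any full_name appears more than once, which is what would cause duplicated or multiplied rows:
SELECT full_name, COUNT(*)
FROM people
GROUP BY full_name
HAVING COUNT(*) > 1;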

How to update a column in a table A using the value from another table B wherein the relationship between tables A & B is 1:N by using max() function

I have two tables, loan_details and loan_his_mapping, with a 1:N relationship. I need to set the hhf_request_id of the loan_details table to the value present in the loan_his_mapping table for each loan.
Since the relationship is 1:N, I want to pick the record for each loan from the loan_his_mapping table using the two conditions mentioned below. The table definitions are as follows:
CREATE TABLE public.loan_details
(
loan_number bigint NOT NULL,
hhf_lob integer,
hhf_request_id integer,
status character varying(100),
CONSTRAINT loan_details_pkey PRIMARY KEY (loan_number)
);
CREATE TABLE public.loan_his_mapping
(
loan_number bigint NOT NULL,
spoc_id integer NOT NULL,
assigned_datetime timestamp without time zone,
loan_spoc_map_id bigint NOT NULL,
line_of_business_id integer,
request_id bigint,
CONSTRAINT loan_spoc_his_map_id PRIMARY KEY (loan_spoc_map_id),
CONSTRAINT fk_loan_spoc_loan_number_his FOREIGN KEY (loan_number)
REFERENCES public.loan_details (loan_number) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION );
The conditions for the update are:
Only records of loan_details with hhf_lob = 4 and status = 'Release' should be updated.
Among the N records in loan_his_mapping for each loan, the one with max(loan_spoc_map_id) should be used.
The query I have right now
update lsa_loan_details ldet
set hhf_request_id = history.request_id
from loan_his_mapping history
where ldet.loan_number = history.loan_number and ldet.status='Release' and ldet.hhf_lob=4 and
history.line_of_business_id=4 ;
I want to know how to use the record for each loan from loan_his_mapping with max(loan_spoc_map_id) to update the column of the loan_details table. Please assist!
You need a sub-query that fetches, for each loan, the row corresponding to the highest loan_spoc_map_id.
Something along these lines:
update loan_details ldet
set hhf_request_id = history.request_id
from (
select distinct on (loan_number) loan_number, request_id
from loan_his_mapping lhm
where lhm.line_of_business_id = 4
order by loan_number, loan_spoc_map_id desc
) as history
where ldet.loan_number = history.loan_number
and ldet.status = 'Release'
and ldet.hhf_lob = 4;
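If you specifically want to express it with max(), an equivalent sketch (assuming the same tables and filters) joins on the row holding the maximum loan_spoc_map_id per loan:
update loan_details ldet
set hhf_request_id = history.request_id
from loan_his_mapping history
where history.loan_number = ldet.loan_number
and history.line_of_business_id = 4
and history.loan_spoc_map_id = (
    select max(h2.loan_spoc_map_id)
    from loan_his_mapping h2
    where h2.loan_number = history.loan_number
    and h2.line_of_business_id = 4
)
and ldet.status = 'Release'
and ldet.hhf_lob = 4;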

Query optimization: connecting meta data to a value list table

I have a database containing a table with data and a meta data table. I want to create a View that selects certain meta data belonging to an item and lists it as a column.
The basic query for the view is: SELECT * FROM item. The item table is defined as:
CREATE TABLE item (
id INTEGER PRIMARY KEY AUTOINCREMENT
UNIQUE
NOT NULL,
traceid INTEGER REFERENCES trace (id)
NOT NULL,
freq BIGINT NOT NULL,
value REAL NOT NULL
);
The meta data to be added follows the pattern metadata.parameter = 'name'.
The meta table is defined as:
CREATE TABLE metadata (
id INTEGER PRIMARY KEY AUTOINCREMENT
UNIQUE
NOT NULL,
parameter STRING NOT NULL
COLLATE NOCASE,
value STRING NOT NULL
COLLATE NOCASE,
datasetid INTEGER NOT NULL
REFERENCES dataset (id),
traceid INTEGER REFERENCES trace (id),
itemid INTEGER REFERENCES item (id)
);
The "name" parameter should be selected this way:
if a record exists where parameter is "name" and itemid matches item.id, then its value should be included in the record.
otherwise, if a record exists where parameter is "name", "itemid" is NULL, and traceid matches item.traceid, its value should be used
otherwise, the result should be NULL, but the record from the item table should be included anyway
Currently, I use the following query to achieve this goal:
SELECT i.*,
COALESCE (
MAX(CASE WHEN m.parameter='name' THEN m.value END),
MAX(CASE WHEN m2.parameter='name' THEN m2.value END)
) AS itemname
FROM item i
JOIN metadata m
ON (m.itemid = i.id AND m.parameter='name')
JOIN metadata m2
ON (m2.itemid IS NULL AND m2.traceid = i.traceid AND m2.parameter='name')
GROUP BY i.id
This query, however, is somewhat inefficient, as the metadata table is used twice and contains many more records than just the "name" ones. So I am looking for a way to improve speed, especially with the following extensions about to be implemented:
there is a third level, "dataset", that should be included: a "parameter=name" record with the same datasetid as the item (looked up for the items via another table that connects traceid and datasetid) should be used if no "parameter=name" record matches on either "itemid" or "traceid"
more meta data should be queried by the view following the same schema
Any help is appreciated.
First of all, you can use one join instead of 2, like this:
JOIN metadata m ON (m.parameter='name' AND (m.itemId = i.id OR (m.itemId IS NULL AND m.traceid = i.traceid)))
Then you can remove COALESCE, using simple select:
SELECT i.*, m.value as itemname
Result should look like this:
SELECT i.*, m.value as itemname
FROM item i
JOIN metadata m ON (m.parameter='name' AND (m.itemId = i.id OR (m.itemId IS NULL AND m.traceid = i.traceid)))
GROUP BY i.id
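Note that with a plain JOIN and a bare m.value under GROUP BY, items without any matching name drop out and the item-level name no longer reliably takes precedence over the trace-level one. A sketch that keeps the stated priority (an assumption about the intended behaviour, using SQLite scalar subqueries) would be:
SELECT i.*,
       COALESCE(
         (SELECT m.value FROM metadata m
           WHERE m.parameter = 'name' AND m.itemid = i.id
           LIMIT 1),
         (SELECT m.value FROM metadata m
           WHERE m.parameter = 'name' AND m.itemid IS NULL AND m.traceid = i.traceid
           LIMIT 1)
       ) AS itemname
FROM item i;
Indexes on metadata (parameter, itemid) and metadata (parameter, traceid) should keep each lookup cheap as the table grows.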

Generate a sub job

PS: I added a new image to better describe what I would like to achieve.
As I couldn't find a way to phrase the question, my chances of finding a ready-made solution are limited, so pardon me if this has already been answered.
I am self-learning SQL and would hope to gain some valuable lessons and information on how to write something like the following; greatly appreciated!
I am seeking to write SQL that lets me add a master job (e.g. 05-16-00000).
Within each master job there will be other "jobs", so it should generate sub jobs (e.g. 05-16-00000 - 01 .. XX). How can I write this?
Just hold an id for each record and if a row has a parent, you set a parent_job_id to the corresponding id. Rows with no parent have the parent_job_id set to NULL.
CREATE TABLE `dbname`.`job`
( `id` BIGINT NOT NULL,
`description` VARCHAR(45) NULL,
`parent_job_id` BIGINT NULL,
PRIMARY KEY (`id`)
);
Get master jobs:
SELECT
`job`.`id`,
`job`.`description`,
`job`.`parent_job_id`
FROM
`dbname`.`job`
WHERE
`job`.`parent_job_id` IS NULL
;
If you are looking for children of job 3 replace the WHERE clause with
WHERE `job`.`parent_job_id` = 3
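For illustration, a minimal sketch (using the job table above with made-up ids and descriptions) of how a master job and its sub jobs could be stored:
INSERT INTO `dbname`.`job` (`id`, `description`, `parent_job_id`)
VALUES (1, 'Master job 05-16-00000', NULL),
       (2, 'Sub job 05-16-00000 - 01', 1),
       (3, 'Sub job 05-16-00000 - 02', 1);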
As the example you added later shows, you want an m:n link of the table with itself. Create a table with parent and child IDs.
CREATE TABLE `dbname`.`job_parent_child`
( `parent_id` BIGINT NOT NULL,
`child_id` BIGINT NOT NULL,
PRIMARY KEY (`parent_id`, `child_id`)
);
Same example - get all children with parent job 3
SELECT * FROM `dbname`.`job` AS `child`
INNER JOIN `dbname`.`job_parent_child` AS `mn`
ON `child`.`id` = `mn`.`child_id`
WHERE `mn`.`parent_id` = 3
;
According to your last edit, just select the job ids (and possibly other data if needed) from the table and iterate over the rows.
SELECT DISTINCT `JOB ID` FROM `jobs`;
Output the master job row of the HTML table, then query with a prepared statement:
SELECT * FROM `jobs` WHERE `JOB ID` = ?;
Output all the rows. That's really all.

SQL JOIN To Find Records That Don't Have a Matching Record With a Specific Value

I'm trying to speed up some code that I wrote years ago for my employer's purchase authorization app. Basically I have a SLOW subquery that I'd like to replace with a JOIN (if it's faster).
When the director logs into the application he sees a list of purchase requests he has yet to authorize or deny. That list is generated with the following query:
SELECT * FROM SA_ORDER WHERE ORDER_ID NOT IN
(SELECT ORDER_ID FROM SA_SIGNATURES WHERE TYPE = 'administrative director');
There are only about 900 records in sa_order and 1800 records in sa_signature and this query still takes about 5 seconds to execute. I've tried using a LEFT JOIN to retrieve records I need, but I've only been able to get sa_order records with NO matching records in sa_signature, and I need sa_order records with "no matching records with a type of 'administrative director'". Your help is greatly appreciated!
The schema for the two tables is as follows:
The tables involved have the following layout:
CREATE TABLE sa_order
(
`order_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_number` BIGINT NOT NULL,
`submit_date` DATE NOT NULL,
`vendor_id` BIGINT NOT NULL,
`DENIED` BOOLEAN NOT NULL DEFAULT FALSE,
`MEMO` MEDIUMTEXT,
`year_id` BIGINT NOT NULL,
`advisor` VARCHAR(255) NOT NULL,
`deleted` BOOLEAN NOT NULL DEFAULT FALSE
);
CREATE TABLE sa_signature
(
`signature_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_id` BIGINT NOT NULL,
`signature` VARCHAR(255) NOT NULL,
`proxy` BOOLEAN NOT NULL DEFAULT FALSE,
`timestamp` TIMESTAMP NOT NULL DEFAULT NOW(),
`username` VARCHAR(255) NOT NULL,
`type` VARCHAR(255) NOT NULL
);
Create an index on sa_signatures (type, order_id).
It is not necessary to convert the query into a LEFT JOIN unless sa_signatures allows NULLs in order_id; with the index in place, the NOT IN query will perform just as well. However, just in case you're curious:
SELECT o.*
FROM sa_order o
LEFT JOIN
sa_signatures s
ON s.order_id = o.order_id
AND s.type = 'administrative director'
WHERE s.type IS NULL
You should pick a NOT NULL column from sa_signatures for the WHERE clause to perform well.
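For reference, that composite index could be created like this (a sketch; note the question's DDL names the table sa_signature while the queries use sa_signatures):
CREATE INDEX idx_signatures_type_order ON sa_signatures (`type`, order_id);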
You could replace the [NOT] IN operator with EXISTS for faster performance.
So you'll have:
SELECT * FROM SA_ORDER WHERE NOT EXISTS
(SELECT ORDER_ID FROM SA_SIGNATURES
WHERE TYPE = 'administrative director'
AND ORDER_ID = SA_ORDER.ORDER_ID);
Reason : "When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, query can use an index within the sub-query."
Source : http://decipherinfosys.wordpress.com/2007/01/21/32/
The following query should work; however, I suspect your real issue is that you don't have the proper indices in place. You should have an index on the SA_SIGNATURES table on the ORDER_ID column.
SELECT *
FROM
SA_ORDER
LEFT JOIN
SA_SIGNATURES
ON
SA_ORDER.ORDER_ID = SA_SIGNATURES.ORDER_ID AND
TYPE = 'administrative director'
WHERE
SA_SIGNATURES.ORDER_ID IS NULL;
select * from sa_order as o inner join sa_signature as s on o.order_id = s.order_id and s.type = 'administrative director'
Also, you can create a non-clustered index on type in the sa_signature table.
Even better: have a master table for types with typeid and typename, and then, instead of saving type as text in your sa_signature table, simply save the type as an integer. That's because comparing integers is much faster than comparing text.
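A minimal sketch of that normalized layout (table and column names here are illustrative, not from the original schema):
CREATE TABLE sa_signature_type
(
  `type_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
  `type_name` VARCHAR(255) NOT NULL UNIQUE
);
-- sa_signature would then store a `type_id` BIGINT referencing this table
-- instead of the VARCHAR `type` column.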