full outer join in redshift - sql

I have 2 tables A and B with columns, containing some details of students (all columns are integer):
A:
st_id,
st_subject_id,
B:
st_id,
st_subject_id,
st_count1,
st_count2
st_id means student id, st_subject_id is subject id.
For student id 15, there are following entries:
A:
15 | 1
15 | 2
15 | 3
B:
15 | 1 | 31 | 11
15 | 2 | 30 | 14
15 | 4 | 21 | 6
15 | 5 | 26 | 9
3 subjects in table A and 4 subjects(2 matching with table A and 2 extra) in table B.
I want to display the final result as:
15 | 1 | 31 | 11
15 | 2 | 30 | 14
15 | 3 | null | null
15 | 4 | 21 | 6
15 | 5 | 26 | 9
Can this be done using full outer join in SQL, or by another method?

I think something like this would suffice, but I can't test right now.
Coalesce means that the first non-null value will be selected from both tables.
select
coalesce(A.st_id, B.st_id) st_id,
coalesce(A.st_subject_id, B.st_subject_id) st_subject_id,
B.st_count1,
B.st_count2
from A
full outer join B
on A.st_id = B.st_id and A.st_subject_id = B.st_subject_id

Related

How to get columns when using buckets (width_bucket)

I would like to know which row were moved to a bucket.
SELECT
width_bucket(s.score, sl.mins, sl.maxs, 9) as buckets,
COUNT(*)
FROM scores s
CROSS JOIN scores_limits sl
GROUP BY 1
ORDER BY 1;
My actual return:
buckets | count
---------+-------
1 | 182
2 | 37
3 | 46
4 | 15
5 | 29
7 | 18
8 | 22
10 | 11
| 20
What I expect to return:
SELECT buckets FROM buckets_table [...] WHERE scores.id = 1;
How can I get, for example, the column 'id' of table scores?
I believe you can include the id in an array with array_agg. If I recreate your case with
create table test (id serial, score int);
insert into test(score) values (10),(9),(5),(4),(10),(2),(5),(7),(8),(10);
The data is
id | score
----+-------
1 | 10
2 | 9
3 | 5
4 | 4
5 | 10
6 | 2
7 | 5
8 | 7
9 | 8
10 | 10
(10 rows)
Using the following and aggregating the id with array_agg
SELECT
width_bucket(score, 0, 10, 11) as buckets,
COUNT(*) nr_ids,
array_agg(id) agg_ids
FROM test s
GROUP BY 1
ORDER BY 1;
You get
buckets | nr_ids | agg_ids
---------+--------+----------
3 | 1 | {6}
5 | 1 | {4}
6 | 2 | {3,7}
8 | 1 | {8}
9 | 1 | {9}
10 | 1 | {2}
12 | 3 | {1,5,10}

Select all the records in the first table that match each of the records in the second

I'm working with an Access database and have two tables:
ID_1
Number
Some other data
1
1
Data
2
2
Data
3
3
Data
4
4
Data
5
3
Data
6
1
Data
7
2
Data
8
3
Data
9
1
Data
10
1
Data
11
2
Data
12
3
Data
13
4
Data
14
1
Data
15
2
Data
16
3
Data
17
4
Data
18
3
Data
19
3
Data
ID_2
Number
Some other data
1
3
Data
2
1
Data
3
2
Data
4
3
Data
5
2
Data
As you see, both tables have duplicate data. I need a query that would select all the records in the first table that match each of the records in the second, they are related by Number field. It's also necessary that these records aren't repeated (that is, that the query doesn't repeat values when selecting). For the given example I should get this result:
ID
ID_1
Number
Some other data
1
3
3
Data
2
5
3
Data
3
8
3
Data
4
12
3
Data
5
16
3
Data
6
18
3
Data
7
19
3
Data
8
1
1
Data
9
6
1
Data
10
9
1
Data
11
10
1
Data
12
14
1
Data
13
2
2
Data
14
7
2
Data
15
11
2
Data
16
15
2
Data
I was thinking that maybe I could use Join, but I still don't know how; tried Where, but also didn't find a use for it. Could you please help me with that?
I don't see where you're generating your output ID field from - or where you're picking your Data field from so here's the best guess.
SELECT Table1.ID_1, Table1.Number, Table1.[Some other data]
FROM Table1
WHERE (Table1.Number In (SELECT Number From Table2))
ORDER BY Table1.Number, Table1.ID_1;
Looks like this:
MySql DB data structure
create table tbl1(ID_1 serial, Number int);
create table tbl2(ID_2 serial, Number int);
insert into tbl1(Number) values (1),(2),(3),(4),(3),(1),(2),(3),(1),(1),(2),(3),(4),(1),(2),(3),(4),(3),(3);
insert into tbl2(Number) values (3),(1),(2),(3),(2);
query (with s), needed to remove duplicates
the window function count(tbl1.Number) OVER(PARTITION BY Number) sorts the result for us by the count of matched numbers
the #rownum variable is needed to count rows
with s as (select distinct Number from tbl2),
f as (select ID_1,tbl1.Number from tbl1 left join s on
(tbl1.Number=s.Number) where s.Number is not null order by
count(tbl1.Number) OVER(PARTITION BY Number) desc)
select #rownum := #rownum + 1 AS ID,ID_1,Number from f, (SELECT #rownum := 0) r;
results
+------+------+--------+
| ID | ID_1 | Number |
+------+------+--------+
| 1 | 3 | 3 |
| 2 | 5 | 3 |
| 3 | 8 | 3 |
| 4 | 12 | 3 |
| 5 | 16 | 3 |
| 6 | 18 | 3 |
| 7 | 19 | 3 |
| 8 | 1 | 1 |
| 9 | 6 | 1 |
| 10 | 9 | 1 |
| 11 | 10 | 1 |
| 12 | 14 | 1 |
| 13 | 2 | 2 |
| 14 | 7 | 2 |
| 15 | 11 | 2 |
| 16 | 15 | 2 |
+------+------+--------+

select only tuples where second column always has same value

I have a similar table to this one
ID | CountryID
1 | 22
1 | 22
2 | 19
3 | 0
3 | 14
3 | 18
3 | 21
3 | 22
3 | 23
4 | 19
5 | 9
5 | 9
6 | 14
and I want to group by the first ID column but select only rows, where the CountryID has the same value throughout an ID. The resulting table should look like
ID | CountryID
1 | 22
2 | 19
4 | 19
5 | 9
6 | 14
Any ideas?
I think the following query should work:
SELECT ID, MAX(CountryID)
FROM Table1
GROUP BY ID
HAVING MIN(CountryID) = MAX(CountryID)
SELECT ID, count(distinct CountryID)
FROM Table1
GROUP BY ID
HAVING count(distinct CountryID)=1

A single query to count the number of distinct rows in one table and the highest value of a column from another table

I have two SQL tables. Table 1 is as follows:
SALEREF
1 | 40303020
2 | 40303021
3 | 40303021
4 | 40303021
5 | 41210028
6 | 4120302701
7 | 41210030
8 | 4112700803
9 | 4112700803
10 | 41215030
11 | 41215026
12 | 41215026
13 | 41215026
14 | 41215026
15 | 41215026
16 | 41215026
17 | 41215026
18 | 41215027
19 | 41215027
20 | 41215027
Table 2 ("LEDGER") is as follows:
SALESREF SALEDATE
0 | 4081200201 | 20140804
1 | 40303020 | 20141015
2 | 40303021 | 20141017
3 | 40303021 | 20141017
4 | 40303021 | 20141017
5 | 41210028 | 20121214
6 | 4120302701 | 20130926
7 | 41210030 | 20130926
8 | 4112700803 | 20131107
9 | 4112700803 | 20131107
10 | 41215030 | 20120720
What I am looking for is a single line that outputs the following:
TotalDistinctSalesRefsInTable1 HighestSaleDateValueInTable2 (that has a matching value in table 1)
9 20141017
the total number of distinct SALESREF's in table 1 and the latest SALESDATE value from table 2.
I've tried selecting within a query but quickly found the limitation of my knowledge although I know I can get the latest overall sale date by doing:
SELECT MAX(LEDGER.SALEDATE) AS LAST_DATE FROM LEDGER
I just need help piecing the whole thing together.
you can use left join , count and max to get your desired result
select count(distinct t1.salesref) as TotalDistinctSalesRefsInTable1,
ifnull(max(l.saledate),0) as HighestSaleDateValueInTable
from table1 t1
left join ledger l
on t1.salesref = l.salesref

Simple psql count query

I am very new to postgresql and would like to generate some summary data from our table
We have a simple message board - table name messages which has an element ctg_uid. Each ctg_uid corresponds to a category name in the table categories.
Here are the categories select * from categories ORDER by ctg_uid ASC;
ctg_uid | ctg_category | ctg_creator_uid
---------+--------------------+-----------------
1 | general | 1
2 | faults | 1
3 | computing | 1
4 | teaching | 2
5 | QIS-FEEDBACK | 3
6 | QIS-PHYS-FEEDBACK | 3
7 | SOP-?-CHANGE | 3
8 | agenda items | 7
10 | Acq & Process | 2
12 | physics-jobs | 3
13 | Tech meeting items | 12
16 | incident-forms | 3
17 | ERRORS | 3
19 | Files | 10
21 | QIS-CAR | 3
22 | doses | 4
24 | admin | 3
25 | audit | 3
26 | For Sale | 4
31 | URGENT-REPORTS | 4
34 | dt-jobs | 3
35 | JOBS | 3
36 | IN-PATIENTS | 4
37 | Ordering | 4
38 | dep-meetings | 4
39 | reporting | 4
What I would like to do is for all messages on our messages is count the frequency of each category
I can do it on a category by category basis
SELECT count(msg_ctg_uid) FROM messages where msg_ctg_uid='13';
However is it possible to do this in a one liner?
The following gives the the category and ctg_uid for each message
SELECT ctg_category, msg_ctg_uid FROM messages INNER JOIN categories ON (ctg_uid = msg_ctg_uid);
but SELECT ctg_category, count(msg_ctg_uid) FROM messages INNER JOIN categories ON (ctg_uid = msg_ctg_uid);
gives me the error ERROR: column "categories.ctg_category" must appear in the GROUP BY clause or be used in an aggregate function
How do I aggregate the frequency of each category ?
You're missing the group by clause:
SELECT ctg_category, count(msg_ctg_uid)
FROM messages INNER JOIN categories ON (ctg_uid = msg_ctg_uid);
GROUP BY ctg_category
this means you want the count per ctg_category