SQL Queries - Join two tables

I have two tables
TABLE 1 - Called Artista (artist) with an ID, Name, first year, second year.
ID NAME year1 year2 COUNTRY
41 Filipe Nobrega 2001 2051 Portugal
42 Bernardo Morais 2010 2060 Portugal
43 Fernando Evora 2013 2070 Portugal
44 Florenzo Giovanni 2003 2047 Italia
45 Tiago Alves 1980 1990 Portugal
46 Rui Gonzales 1975 1995 Espanha
47 Jose Almeida 1800 1876 Portugal
48 Jhon Snow 1900 1940 Winterfell
49 test 2001 2020 Espanha
TABLE 2 - Called autoria (author), with the ID of a piece of art and the ID of an artist, also it has the type of art( painting, music, sculpture...)
ART ARTIST TYPE_OF_ART
121 41 Pintura
122 41 Musica
123 42 Pintura
124 42 Cinema
125 42 Literatura
126 43 Teatro
127 43 Literatura
128 43 Danca
129 43 Arte_digital
130 43 Pintura
131 44 Pintura
132 44 Cinema
133 44 Pintura
134 45 Cinema
135 45 Literatura
136 46 Cinema
137 46 Literatura
138 46 Literatura
139 47 Arte_digital
140 47 Pintura
141 47 Teatro
142 48 Cinema
The problem is: get all the artists that made at most 2 different pieces of art.
The result should be:
FILIPE NOBREGA - 41 he has 2 pieces of art
TIAGO ALVES - 45 he has 2 pieces of art
JOHN SNOW - 48 he has 1 piece of art
AND
TEST - 49 he has 0
This is what I've got:
SELECT DISTINCT A.name, A.id
FROM artista A, autoria AUT
WHERE AUT.artist = A.id
GROUP BY(A.name, A.id)
HAVING (COUNT(*) <= 2);
And it returns all of the above except TEST.

Your query performs an INNER JOIN (the comma syntax with the join condition in WHERE). An INNER JOIN drops every artist that has no matching row in autoria, so TEST never reaches the GROUP BY. Use a LEFT OUTER JOIN so that artists without any art are kept, and count a column from autoria rather than COUNT(*), so that an artist with no pieces counts as 0 instead of 1. DISTINCT is redundant once you group:
SELECT A.name, A.id
FROM artista A LEFT OUTER JOIN autoria AUT ON AUT.artist = A.id
GROUP BY A.name, A.id
HAVING COUNT(AUT.art) <= 2;
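To see the difference between the two joins on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption here (the question does not name the database), the tables are reduced to the columns the query needs, and the sketch counts aut.art rather than COUNT(*) so that an artist without any art shows 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artista (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE autoria (art INTEGER PRIMARY KEY, artist INTEGER)")
conn.executemany("INSERT INTO artista VALUES (?, ?)", [
    (41, "Filipe Nobrega"), (42, "Bernardo Morais"), (43, "Fernando Evora"),
    (44, "Florenzo Giovanni"), (45, "Tiago Alves"), (46, "Rui Gonzales"),
    (47, "Jose Almeida"), (48, "Jhon Snow"), (49, "test"),
])

# Pieces per artist as listed in the question; art ids run from 121 to 142.
pieces_per_artist = {41: 2, 42: 3, 43: 5, 44: 3, 45: 2, 46: 3, 47: 3, 48: 1}
art_id = 121
for artist, n in pieces_per_artist.items():
    for _ in range(n):
        conn.execute("INSERT INTO autoria VALUES (?, ?)", (art_id, artist))
        art_id += 1

# Inner join: artist 49 has no row in autoria, so it disappears.
inner_rows = conn.execute("""
    SELECT a.name, a.id, COUNT(*)
    FROM artista a JOIN autoria aut ON aut.artist = a.id
    GROUP BY a.name, a.id
    HAVING COUNT(*) <= 2
    ORDER BY a.id
""").fetchall()

# Left join: artist 49 is kept, and COUNT(aut.art) ignores the NULL it joins to.
left_rows = conn.execute("""
    SELECT a.name, a.id, COUNT(aut.art)
    FROM artista a LEFT JOIN autoria aut ON aut.artist = a.id
    GROUP BY a.name, a.id
    HAVING COUNT(aut.art) <= 2
    ORDER BY a.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The only difference between the two result sets is the row for artist 49.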

Related

BigQuery SQL: determine the number of daily transactions given a moving counter

I've been stuck for hours with writing a SQL query that would solve the following:
Given a history of a daily customer transaction counter, is it possible to specify exactly how many transactions were made each day?
Each data point represents the sum of all transactions made in the last 30 days (ignore the missing dates).
The counter will decrement if the number of transactions made on the current day was smaller than the number of transactions that are no longer factored in, as they were made 31 days ago. It would increment otherwise.
The complete history of the counter is unavailable, so we don't know the numbers' evolution from the beginning, but only from certain point in time.
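The counter described above behaves like a rolling sum. Here is a minimal Python sketch of that relationship, using made-up daily counts and assuming the 30-day window includes the current day, so the value dropping out is the one from 30 days earlier (shift the index by one if the window excludes today). The names rolling_counter, daily and recovered are hypothetical.

```python
import random

WINDOW = 30  # window length taken from the question


def rolling_counter(daily, window=WINDOW):
    """Counter on day d = sum of the last `window` daily values, day d included."""
    return [sum(daily[max(0, d - window + 1): d + 1]) for d in range(len(daily))]


random.seed(0)
daily = [random.randint(0, 10) for _ in range(90)]  # made-up daily transaction counts
counter = rolling_counter(daily)

# Day-over-day change = today's transactions minus those that just left the window.
for d in range(WINDOW, len(daily)):
    assert counter[d] - counter[d - 1] == daily[d] - daily[d - WINDOW]

# Rebuilding the daily counts from the counter works only if the daily counts
# of the first full window are already known.
recovered = daily[:WINDOW]
for d in range(WINDOW, len(counter)):
    recovered.append(counter[d] - counter[d - 1] + recovered[d - WINDOW])

print(recovered == daily)
```

Under this assumption the counter history fixes only the differences daily[d] - daily[d - 30]. Without the daily counts of the first window, the counter alone does not pin down a unique set of daily values, and clamping an intermediate result to 0 breaks the identity that makes the sums match.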
Please refer to the following table (for one offer_id):
transaction_date num_transactions
0 21/05/2022 25
1 22/05/2022 26
2 23/05/2022 25
3 24/05/2022 28
4 25/05/2022 30
5 26/05/2022 32
6 27/05/2022 33
7 28/05/2022 34
8 29/05/2022 33
9 30/05/2022 33
10 31/05/2022 34
11 01/06/2022 35
12 02/06/2022 35
13 03/06/2022 59
14 04/06/2022 73
15 07/06/2022 87
16 08/06/2022 98
17 09/06/2022 109
18 10/06/2022 120
19 11/06/2022 123
20 12/06/2022 122
21 13/06/2022 127
22 14/06/2022 142
23 15/06/2022 145
24 16/06/2022 148
25 17/06/2022 156
26 18/06/2022 162
27 19/06/2022 164
28 20/06/2022 167
29 21/06/2022 173
30 22/06/2022 185
31 23/06/2022 194
32 24/06/2022 206
33 25/06/2022 206
34 26/06/2022 208
35 28/06/2022 227
36 29/06/2022 237
37 30/06/2022 241
38 01/07/2022 248
39 02/07/2022 237
40 03/07/2022 230
41 04/07/2022 217
42 05/07/2022 208
43 06/07/2022 214
44 07/07/2022 216
45 08/07/2022 211
46 09/07/2022 203
47 10/07/2022 194
48 11/07/2022 192
49 12/07/2022 195
50 13/07/2022 193
51 14/07/2022 181
52 15/07/2022 174
53 16/07/2022 169
54 17/07/2022 162
55 18/07/2022 162
56 19/07/2022 164
57 20/07/2022 160
58 21/07/2022 163
59 22/07/2022 155
60 23/07/2022 144
61 24/07/2022 134
62 25/07/2022 139
63 26/07/2022 154
For each day (at least starting with 23/06) I'd like to be able to tell what were the numbers of transactions day-by-day in the preceding 30 days that sum up to the transactions counter on that day.
My current code in BigQuery SQL is below. It is obviously wrong - although the calculated counter evolution history does sum up to the right numbers when negative numbers are included, I'm interested in finding out only the actual transaction counts (thus only positive numbers and 0 are in question) for each last 30-days window.
When I add a simple condition that clamps any negative result to 0:
WHEN IFNULL(transactions_diff_yesterday + transaction_reference, 0) < 0
THEN 0
the sum for the last 30 days never matches the counter.
WITH outer_base AS (
  WITH base AS (
    SELECT
      *,
      LAG(num_transactions, 31) OVER (PARTITION BY offer_id ORDER BY offer_id, transaction_date) AS transactions_31_days_ago,
      IFNULL(LAG(num_transactions, 30) OVER (PARTITION BY offer_id ORDER BY offer_id, transaction_date), 0) AS transactions_30_days_ago,
      IFNULL(LAG(num_transactions, 1) OVER (PARTITION BY offer_id ORDER BY offer_id, transaction_date), 0) AS transactions_yesterday
    FROM
      `my_table`
    ORDER BY
      offer_id,
      transaction_date
  )
  SELECT
    *,
    IFNULL(num_transactions - transactions_yesterday, 0) AS transactions_diff_yesterday,
    IFNULL(transactions_30_days_ago - transactions_31_days_ago, 0) AS transaction_reference
  FROM
    base
)
SELECT
  *,
  CASE
    WHEN IFNULL(transactions_diff_yesterday + transaction_reference, 0) < 0
      THEN 0
    ELSE IFNULL(transactions_diff_yesterday + transaction_reference, 0)
  END AS real_transactions
FROM
  outer_base;

Group questions by answers with SQL

SQL Server table:
userId  QuestionId  Question                                  AnswerId  Answer
32      98          What is the total salary in your family?  380       4000
32      99          How many are brothers?                    385       5
33      98          What is the total salary in your family?  382       3000
33      99          How many are brothers?                    385       5
34      98          What is the total salary in your family?  382       3000
34      99          How many are brothers?                    385       5
35      98          What is the total salary in your family?  381       5000
35      99          How many are brothers?                    384       4
36      98          What is the total salary in your family?  381       5000
36      99          How many are brothers?                    383       3
37      98          What is the total salary in your family?  381       5000
37      99          How many are brothers?                    383       3
38      98          What is the total salary in your family?  380       4000
38      99          How many are brothers?                    385       5
39      98          What is the total salary in your family?  380       4000
39      99          How many are brothers?                    385       5
41      98          What is the total salary in your family?  381       5000
41      99          How many are brothers?                    383       3
I want to count how many users gave each combination of answers to the two questions.
Example:
salary: 5000, brothers: 3, count = 3 users
Question1Id  Question2Id  Answer1  Answer2  count
98           99           3000     5        2
98           99           4000     5        3
98           99           5000     3        3
98           99           5000     4        1
Here you go:
select
    a.questionid as question1id, b.questionid as question2id,
    a.answer as answer1, b.answer as answer2, count(*) as count
from mytable a
join mytable b on a.userid = b.userid
where a.questionid = 98
  and b.questionid = 99
group by a.questionid, b.questionid, a.answer, b.answer
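As a quick check, the self-join can be run against the sample data with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for SQL Server, and the table is reduced to the three columns the query uses (the Question text and AnswerId columns are left out).

```python
import sqlite3

# (userId, answer to question 98, answer to question 99), taken from the sample table.
users = [(32, 4000, 5), (33, 3000, 5), (34, 3000, 5), (35, 5000, 4), (36, 5000, 3),
         (37, 5000, 3), (38, 4000, 5), (39, 4000, 5), (41, 5000, 3)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (userid INTEGER, questionid INTEGER, answer INTEGER)")
for uid, salary, brothers in users:
    conn.execute("INSERT INTO mytable VALUES (?, 98, ?)", (uid, salary))
    conn.execute("INSERT INTO mytable VALUES (?, 99, ?)", (uid, brothers))

# Pair each user's answer to question 98 with the same user's answer to question 99,
# then count how many users share each pair.
rows = conn.execute("""
    SELECT a.questionid, b.questionid, a.answer, b.answer, COUNT(*)
    FROM mytable a
    JOIN mytable b ON a.userid = b.userid
    WHERE a.questionid = 98 AND b.questionid = 99
    GROUP BY a.questionid, b.questionid, a.answer, b.answer
    ORDER BY a.answer, b.answer
""").fetchall()

for row in rows:
    print(row)
```

The ORDER BY is added only to make the output order predictable; the counts match the expected table in the question.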

Sql that uses one table and gives an output of all possible combinations

I have a table with the following information, and I would like the output to contain every combination of two records: each record should appear once next to every other record until the whole table has been covered. I want to use the 4 values to calculate a relationship between Value1 / Value2 and the new Value1 / new Value2.
id Value1 value2
100 34 48
101 35 45
102 22 15
103 35 17
104 37 10
and the output should be
100 34 48 101 35 45
100 34 48 102 22 15
100 34 48 103 35 17
100 34 48 104 37 10
101 35 45 102 22 15
101 35 45 103 35 17
101 35 45 104 37 10
102 22 15 103 35 17
102 22 15 104 37 10
103 35 17 104 37 10
As can be seen, those are all the combinations of rows in the table, but I have thousands of rows to process.
Is there a SQL query that produces this output, going through the table and generating new rows without duplicate pairs?
Thank you
You can use join:
select t1.id, t1.value1, t1.value2, t2.id, t2.value1, t2.value2
from t t1 join
t t2
on t1.id < t2.id
order by t1.id, t2.id;
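The condition t1.id < t2.id keeps each unordered pair exactly once and never pairs a row with itself. Here is a minimal sketch of the query on the sample data, using Python's built-in sqlite3 module (SQLite is an assumption; the question does not name the database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value1 INTEGER, value2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (100, 34, 48), (101, 35, 45), (102, 22, 15), (103, 35, 17), (104, 37, 10),
])

# Self-join: every row is paired with each row that has a larger id.
pairs = conn.execute("""
    SELECT t1.id, t1.value1, t1.value2, t2.id, t2.value1, t2.value2
    FROM t t1
    JOIN t t2 ON t1.id < t2.id
    ORDER BY t1.id, t2.id
""").fetchall()

for pair in pairs:
    print(pair)
print(len(pairs))
```

For n rows the result has n * (n - 1) / 2 rows, so 5 rows give 10 pairs, while a table of 5,000 rows would already give about 12.5 million.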

Rank() with Null first in Bigquery based on multiple columns

I have data like that shown below
Subject_id T1 T2 T3 T4 T5
1234
1234 21 22 23 24 25
3456 34 31
3456 34 31 36 37 39
5678 65 64 62 61 67
5678 65 64 62 67
9876 12 13 14 15 16
4790 47 87 52 13 16
As you can see above, subject_ids 1234,3456 and 5678 are repeating.
I would like to remove those repeating subjects when they have null/empty/blank value in any of the columns like T1,T2,T3,T4,T5.
The problem is that the real table has more than 250 columns, and I am not sure I can write 250 WHERE conditions checking for null values. So I was trying row_number() and rank(), and I am not sure which one is better. This is what I tried:
SELECT *,ROW_NUMBER() OVER(PARTITION BY subject_id,T1,T2,T3,T4,T5) NULLS FIRST
from table A;
But it throws a syntax error: Syntax error: Unexpected keyword NULLS at [1:62]
I expect my output to be like below
Subject_id T1 T2 T3 T4 T5
1234 21 22 23 24 25
3456 34 31 36 37 39
5678 65 64 62 61 67
9876 12 13 14 15 16
4790 47 87 52 13 16
As you can see, the output doesn't contain rows which had at least 1 null/empty/blank value in T1,T2,T3,T4,T5 columns.
Can you help, please?
Below is for BigQuery Standard SQL
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE NOT REGEXP_CONTAINS(FORMAT('%t', t), r'NULL')
Applied to the sample data from your question, the output is
Row Subject_id t1 t2 t3 t4 t5
1 1234 21 22 23 24 25
2 3456 34 31 36 37 39
3 5678 65 64 62 61 67
4 9876 12 13 14 15 16
5 4790 47 87 52 13 16
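The idea behind the query above is to test the whole row at once instead of naming each of the 250 columns. Here is a minimal Python sketch of the same row-level check, outside BigQuery. The positions of the missing values (None) are assumptions, because the sample table in the question does not show which columns are blank.

```python
# Rows are (subject_id, T1, T2, T3, T4, T5); None marks a missing value.
# Which columns are missing is an assumption made for this sketch.
rows = [
    (1234, None, None, None, None, None),
    (1234, 21, 22, 23, 24, 25),
    (3456, 34, 31, None, None, None),
    (3456, 34, 31, 36, 37, 39),
    (5678, 65, 64, 62, 61, 67),
    (5678, 65, 64, 62, None, 67),
    (9876, 12, 13, 14, 15, 16),
    (4790, 47, 87, 52, 13, 16),
]

# Keep a row only if none of its values is missing, however many columns it has.
complete_rows = [row for row in rows if all(value is not None for value in row)]

for row in complete_rows:
    print(row)
```

One design point to keep in mind: a check based on the text form of the row, like the regular expression above, could also match a string column that happens to contain the text NULL, whereas a per-value check like this one cannot.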
I think you want:
SELECT *,
ROW_NUMBER() OVER (PARTITION BY subject_id
ORDER BY (T1 IS NULL OR T2 IS NULL OR T3 IS NULL OR T4 IS NULL OR T5 IS NULL) DESC
)
FROM table A;
I might approach this problem differently, but this appears to be what you are trying to write.

Creating Groups of Consecutive Values in Access Query

To be clear, I'm not a developer, I'm just a business analyst trying to achieve something in Access which has stumped me.
I have a table of values as such:
Area Week
232 1
232 2
232 3
232 4
232 5
232 6
232 7
232 8
232 9
232 10
232 11
232 12
232 35
232 36
232 37
232 38
232 39
232 41
232 42
232 43
232 44
232 45
232 46
232 47
232 48
232 49
232 50
232 51
232 52
330 1
330 2
330 3
330 4
330 33
330 34
330 35
330 36
330 37
330 38
330 39
330 40
330 41
330 42
330 43
330 44
330 45
330 47
330 48
330 49
330 50
I would like to create a query using SQL in Access to create grouping as follows:
Area Code Week Start Week End
232 1 12
232 35 39
232 41 52
330 1 4
330 33 45
330 47 50
However everything I have read leads me to use the ROWNUM() function which is not native to Access.
I'm OK with general queries in Access, but am not very familiar with SQL.
How can I go about achieving my task?
Thanks
Mike
Use another database! MS Access doesn't have good functionality (in general).
You can do what you want, but it is expensive:
select area, min(week) as week_start, max(week) as week_end
from (select t.*,
             (select count(*)
              from t as t2
              where t2.area = t.area and t2.week <= t.week
             ) as seqnum
      from t
     ) as t
group by area, (week - seqnum);
The correlated subquery is essentially doing row_number().
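The trick is that within a run of consecutive weeks, week minus its sequence number stays constant, so grouping by that difference yields one row per run. Here is a minimal sketch that checks this on the sample data with Python's built-in sqlite3 module. SQLite stands in for Access here, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Consecutive week runs per area, matching the rows of the sample table.
runs = {232: [(1, 12), (35, 39), (41, 52)], 330: [(1, 4), (33, 45), (47, 50)]}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (area INTEGER, week INTEGER)")
for area, spans in runs.items():
    for start, end in spans:
        conn.executemany("INSERT INTO t VALUES (?, ?)",
                         [(area, week) for week in range(start, end + 1)])

# seqnum counts the rows of the same area up to the current week, which plays
# the role of row_number(); week - seqnum is constant within a consecutive run.
groups = conn.execute("""
    SELECT area, MIN(week) AS week_start, MAX(week) AS week_end
    FROM (SELECT t.area, t.week,
                 (SELECT COUNT(*)
                  FROM t AS t2
                  WHERE t2.area = t.area AND t2.week <= t.week) AS seqnum
          FROM t) AS s
    GROUP BY area, week - seqnum
    ORDER BY area, week_start
""").fetchall()

for group in groups:
    print(group)
```

The correlated subquery runs once per row, which is why the approach gets expensive on large tables.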