Join table rows with empty key - sql

I have these 2 tables which I would like to query:
create table active_pairs
(
pair text,
exchange_id integer
);
create table exchanges
(
exchange_id integer,
exchange_full_name text
);
INSERT INTO active_pairs (pair, exchange_id)
VALUES ('London/Berlin', 2),
('London/Berlin', 3),
('Paris/Berlin', 4),
('Paris/Berlin', 3),
('Oslo/Berlin', 2),
('Oslo/Berlin', 6),
('Huston/Berlin', 2);
INSERT INTO exchanges (exchange_id, exchange_full_name)
VALUES (2, 'Exchange 1'),
(3, 'Exchange 2'),
(4, 'Exchange 3'),
(3, 'Exchange 21'),
(2, 'Exchange 12'),
(6, 'Exchange 11'),
(2, 'Exchange 31');
I use the following two queries.
Query to list items:
SELECT * FROM common.active_pairs ap
INNER JOIN common.exchanges ce on ap.exchange_id = ce.exchange_id
WHERE ap.exchange_id = 1
GROUP BY pair, ap.exchange_id, ce.exchange_id, ap.id
HAVING COUNT(ap.pair) = 1;
I get as a result 172 rows.
Query to count rows to calculate pagination:
SELECT DISTINCT COUNT(*) OVER () counter
FROM common.active_pairs cp
INNER JOIN common.exchanges ce on cp.exchange_id = ce.exchange_id
WHERE cp.exchange_id = 1
GROUP BY pair
HAVING COUNT(cp.pair) = 1
I get 158 rows as a result.
Both queries should return the same total so that I can calculate pagination correctly.
Is it possible that records with an empty exchange_id are causing the different results?

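Your pagination query groups only by the name of the pair (active_pairs.pair), while the listing query also groups by ap.exchange_id, ce.exchange_id and ap.id. If that name is not unique, rows with the same pair name but different IDs fall into a single group in the pagination query. That group then has a count above 1 and is removed by HAVING COUNT(cp.pair) = 1, whereas the listing query keeps each of those rows.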
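One way to keep the two totals in sync is to count the rows of the listing query itself instead of maintaining a second, different GROUP BY. The following is only a minimal PostgreSQL sketch: the temp tables, the id column and the three sample rows are assumptions made up for illustration, not the asker's real schema.

```sql
-- Hypothetical stand-ins for common.active_pairs and common.exchanges.
CREATE TEMP TABLE active_pairs (id integer, pair text, exchange_id integer);
CREATE TEMP TABLE exchanges (exchange_id integer, exchange_full_name text);

INSERT INTO active_pairs (id, pair, exchange_id)
VALUES (1, 'London/Berlin', 1),
       (2, 'London/Berlin', 1),  -- same pair name, different id
       (3, 'Paris/Berlin', 1);
INSERT INTO exchanges (exchange_id, exchange_full_name)
VALUES (1, 'Exchange 1');

-- Count the groups of the listing query, reusing its GROUP BY and HAVING.
SELECT COUNT(*) AS counter
FROM (
    SELECT ap.id
    FROM active_pairs ap
    INNER JOIN exchanges ce ON ap.exchange_id = ce.exchange_id
    WHERE ap.exchange_id = 1
    GROUP BY ap.pair, ap.exchange_id, ce.exchange_id, ap.id
    HAVING COUNT(ap.pair) = 1
) AS listed;
```

With these three rows the subquery produces three groups, while grouping by pair alone would leave only Paris/Berlin after the HAVING filter.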

Related

SQL - extracting data from 3 tables

I'm new to sql and I'm wondering how to extract the relevant data from the sites and plugins table using the sites_plugins table. The data I am interested in is sites.description, plugins.fullName, plugins.currentVersion and sites_plugins.syncedAt
Below are the sql tables
INSERT INTO sites (id, name, description) VALUES
(1, "facebook", "Facebook"),
(2, "amazon", "Amazon"),
(3, "google", "Google");
INSERT INTO plugins (id, name, fullName, currentVersion) VALUES
(1, "yoast", "Yoast SEO", "16.8"),
(2, "jetpack", "Jetpack", "9.9.1"),
(3, "akismet", "Akismet", "4.1.10"),
(4, "wordfence", "Wordfence Security", "7.5.4"),
(5, "contact-form", "Contact Form 7", "7.5.4.2");
INSERT INTO sites_plugins (siteId, pluginId, version, syncedAt) VALUES
(1, 1, "16.8", NULL),
(1, 3, "3.8", '2021-07-01 10:00:00'),
(2, 3, "4.1.10", NULL),
(2, 5, "7.0", NULL),
(2, 4, "7.5.3", '2021-06-15 12:00:00');
Ultimately, I would very much like to achieve a data format like the following
{["Amazon", "Jetpack", "16.8", "NULL"]}
Thanks for any advice, Best Kacper
You must join all the tables on their linking columns. In the SELECT list you can then pick whichever columns you want from any of the joined tables.
SELECT
    s.description,
    p.fullName,
    sp.version,   -- installed version; use p.currentVersion for the plugin's latest version
    sp.syncedAt
FROM sites_plugins sp
INNER JOIN sites s ON s.id = sp.siteId
INNER JOIN plugins p ON p.id = sp.pluginId;
But the result you want is not possible with the data you provided: no row in sites_plugins links Amazon (siteId 2) to Jetpack (pluginId 2).

Count Distinct not working as expected, output is equal to count

I have a table where I'm trying to count the distinct number of members per group. I know there are duplicates because count(distinct memberid) is lower than count(*). But when I group by the group and count distinct, it's not spitting out the number I'd expect.
select count(distinct memberid), count(*)
from dbo.condition c
output:
count count
303,781 348,722
select groupid, count(*), count(distinct memberid)
from dbo.condition c
group by groupid
output:
groupid count count
2 19,984 19,984
3 25,689 25,689
5 14,400 14,400
24 56,058 56,058
25 200,106 200,106
29 27,847 27,847
30 1,370 1,370
31 3,268 3,268
The two counts in the second query are equal when they shouldn't be. Does anyone know what I'm doing wrong? I need the 3rd column to add up to 303,781, not 348,722.
Thanks!
There's nothing wrong with your second query. Since you're aggregating on the "groupid" field, the output tells you that there are no duplicate "memberid" values within any single groupid (so counting rows gives the same result as counting distinct values).
On the other hand, the first query aggregates over the whole table without any grouping, and its output shows that the same "memberid" value appears under different "groupid" values.
I took the liberty of adding an example that corroborates your answer:
create table aa (groupid int not null, memberid int not null );
insert into aa (groupid, memberid)
values
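(1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3), (4, 1), (4, 2), (4, 3), (4, 5), (5, 3);
-- per group: count(*) equals count(distinct memberid)
select groupid, count(*), count(distinct memberid)
from aa group by groupid;
-- whole table: the two counts differ, because members repeat across groups
select count(*), count(distinct memberid)
from aa;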
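To see which members account for the gap between the two totals, one option is to list the memberid values that occur in more than one group. This is a sketch against a temp copy of the example table (the name #aa is only an illustration, and the temp-table syntax assumes SQL Server, as the dbo schema in the question suggests):

```sql
-- Hypothetical temp copy of the example table aa.
create table #aa (groupid int not null, memberid int not null);
insert into #aa (groupid, memberid)
values (1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3),
       (4, 1), (4, 2), (4, 3), (4, 5), (5, 3);

-- Members in several groups are counted once per group by the grouped query,
-- but only once by the ungrouped count(distinct memberid).
select memberid, count(distinct groupid) as group_count
from #aa
group by memberid
having count(distinct groupid) > 1
order by memberid;
```

For the example data this lists members 1, 2 and 3, which are exactly the ones shared between groups.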

running sums, find blocks of rows that sum to given list of values

here is the test data:
create table #trial (id int, val int);
insert into #trial (id, val)
values (1, 1), (2, 3),(3, 2), (4, 4), (5, 5),(6, 6), (7, 7), (8, 2),(9, 3), (10, 4), (11, 6),(12, 10), (13, 5), (14, 3),(15, 2) ;
select * from #trial order by id asc;
description of data:
i have a list of n values that represent sums. assume they are (10, 53) for this example. the values in the #trial can be both negative & positive. note that the values in #trial will always sum to the given sums.
description of pattern:
10 in this example is the 1st sum i want to match & 53 is the 2nd sum i want to match. the dataset has been set up in such a way that a block of consecutive rows will always sum to these sums with this feature: in this example, the 1st 4 rows sum to 10, & then the next 11 rows sum to 53. the dataset will always have this feature. in other words, the 1st given sum can be found from summing 1 to ith row, then 2nd sum from i + 1 row to jth row, & so on....
finally i want an id to identify the groups of rows that sum to the given sums. so in this example, 1 to 4th row will take id 1, 5th to 15th row will take id 2.
This answers the original question.
From what you describe you can do something like this:
select v.grp, t.*
from (select t.*, sum(val) over (order by id) as running_val
      from #trial t
     ) t left join
     (select grp, lag(upper, 1, -1) over (order by upper) as lower, upper
      -- upper is the cumulative bound of each block: 10, then 10 + 53 = 63
      from (values (1, 10), (2, 63)) v(grp, upper)
     ) v
     on t.running_val > v.lower and
        t.running_val <= v.upper
order by t.id;
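Because the join compares a running total with each bound, the bounds have to be cumulative (10, then 10 + 53 = 63) rather than the raw targets. The sketch below derives them from the target list with a second running sum. It assumes SQL Server, uses a hypothetical temp table #trial_demo with the question's test data, and only works if the running total never falls back to or below an earlier bound, which holds for the all-positive sample values but is not guaranteed once negative values are involved.

```sql
-- Hypothetical temp table mirroring the question's test data.
create table #trial_demo (id int, val int);
insert into #trial_demo (id, val)
values (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (6, 6), (7, 7), (8, 2),
       (9, 3), (10, 4), (11, 6), (12, 10), (13, 5), (14, 3), (15, 2);

select b.grp, t.id, t.val
from (select id, val, sum(val) over (order by id) as running_val
      from #trial_demo
     ) t
left join
     (select grp,
             sum(target) over (order by grp) - target as lower_total, -- total before this block
             sum(target) over (order by grp) as upper_total           -- total after this block
      from (values (1, 10), (2, 53)) v(grp, target)
     ) b
     on t.running_val > b.lower_total
    and t.running_val <= b.upper_total
order by t.id;
```

With the sample data, rows 1 to 4 fall into the range (0, 10] and get grp 1, and rows 5 to 15 fall into (10, 63] and get grp 2.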

SQL Select: Do rows matching id all have the same column value

I have a table like this
sub_id reference
1 A
1 A
1 A
1 A
1 A
1 A
1 C
2 B
2 B
3 D
3 D
I want to make sure all the rows in each sub_id group have the same reference.
Meaning, for example, all references in:
group 1 should be A
group 2 should be B
group 3 should be D
If they are not, then I would like to have returned a list of sub_id's.
So for the table above my result would be: 1
Ideally, with these conditions reference would be in a separate table with sub_id as PK, but I need to fix first for a massive dataset before I can move on restructuring the database.
You could use the following method:
select t.sub_id
from YourTable t
group by t.sub_id
having max(t.reference) <> min(t.reference)
Change YourTable to suit.
Are you looking for a simple aggregation?
select sub_id
from your_table t
group by sub_id
having count(distinct reference) > 1;
The query you want:
SELECT sub_id
FROM test_sub
GROUP BY sub_id HAVING count(DISTINCT reference) > 1
;
Here is what I used to test it:
CREATE TABLE `test_sub` (
sub_id int(11) NOT NULL,
reference varchar(45) DEFAULT NULL
);
INSERT INTO test_sub (sub_id, reference) VALUES
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'C'),
(2, 'B'),
(2, 'B'),
(3, 'D'),
(3, 'D'),
(3, 'D'),
(4, 'E'),
(4, 'E'),
(4, 'E'),
(5, 'F'),
(5, 'G')
;
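When cleaning up a large dataset it can also help to see which references conflict, not just the sub_id. A possible sketch, assuming MySQL (GROUP_CONCAT is MySQL-specific) and using a hypothetical, shortened temp copy of the test table named test_sub_demo:

```sql
-- Hypothetical, shortened copy of the test table.
CREATE TEMPORARY TABLE test_sub_demo (
    sub_id int NOT NULL,
    reference varchar(45) DEFAULT NULL
);
INSERT INTO test_sub_demo (sub_id, reference) VALUES
(1, 'A'), (1, 'A'), (1, 'C'),
(2, 'B'), (2, 'B'),
(5, 'F'), (5, 'G');

-- List each inconsistent sub_id together with the distinct references it holds.
SELECT sub_id,
       GROUP_CONCAT(DISTINCT reference ORDER BY reference) AS refs
FROM test_sub_demo
GROUP BY sub_id
HAVING COUNT(DISTINCT reference) > 1;
```

For these rows the query returns sub_id 1 with A,C and sub_id 5 with F,G. Note that NULL references are ignored by both COUNT(DISTINCT ...) and GROUP_CONCAT.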

Select TOP columns from table1, join table2 with their names

I have a TABLE1 with these two columns, storing departure and arrival identifiers from flights:
dep_id arr_id
1 2
6 2
6 2
6 2
6 2
3 2
3 2
3 2
3 4
3 4
3 6
3 6
and a TABLE2 with the respective IDs containing their ICAO codes:
id icao
1 LPPT
2 LPFR
3 LPMA
4 LPPR
5 LLGB
6 LEPA
7 LEMD
How can i select the top count of TABLE1 (most used departure id and most used arrival id) and group it with the respective ICAO code from TABLE2, so i can get from the provided example data:
most_arrivals most_departures
LPFR LPMA
It's simple to get ONE of them, but mixing two or more columns doesn't seem to work for me no matter what i try.
You can do it like this.
Create and populate tables.
CREATE TABLE dbo.Icao
(
id int NOT NULL PRIMARY KEY,
icao nchar(4) NOT NULL
);
CREATE TABLE dbo.Flight
(
dep_id int NOT NULL
FOREIGN KEY REFERENCES dbo.Icao(id),
arr_id int NOT NULL
FOREIGN KEY REFERENCES dbo.Icao(id)
);
INSERT INTO dbo.Icao (id, icao)
VALUES
(1, N'LPPT'),
(2, N'LPFR'),
(3, N'LPMA'),
(4, N'LPPR'),
(5, N'LLGB'),
(6, N'LEPA'),
(7, N'LEMD');
INSERT INTO dbo.Flight (dep_id, arr_id)
VALUES
(1, 2),
(6, 2),
(6, 2),
(6, 2),
(6, 2),
(3, 2),
(3, 2),
(3, 2),
(3, 4),
(3, 4),
(3, 6),
(3, 6);
Then do a SELECT using two subqueries.
SELECT
(SELECT TOP 1 I.icao
FROM dbo.Flight AS F
INNER JOIN dbo.Icao AS I
ON I.id = F.arr_id
GROUP BY I.icao
ORDER BY COUNT(*) DESC) AS 'most_arrivals',
(SELECT TOP 1 I.icao
FROM dbo.Flight AS F
INNER JOIN dbo.Icao AS I
ON I.id = F.dep_id
GROUP BY I.icao
ORDER BY COUNT(*) DESC) AS 'most_departures';
To see how SQL Server runs the query, enable the "Include Actual Execution Plan" toolbar button in SQL Server Management Studio before you execute it.
In the graphical execution plan, each icon represents an operation performed by the SQL Server engine and the arrows represent data flows. Data flows from right to left, so the result is the leftmost icon.
try this one:
select
    (select icao
     from table2 where id = (
         select top 1 arr_id
         from table1
         group by arr_id
         order by count(*) desc)
    ) as most_arrivals,
    (select icao
     from table2 where id = (
         select top 1 dep_id
         from table1
         group by dep_id
         order by count(*) desc)
    ) as most_departures;
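Both answers rely on TOP 1, which returns an arbitrary winner if two airports share the highest count. If a deterministic result matters, one option is to add a second sort key. The sketch below assumes SQL Server and uses hypothetical temp tables #icao and #flight with a reduced set of rows; sorting ties by ICAO code is just one possible choice.

```sql
-- Hypothetical temp tables standing in for TABLE2 and TABLE1.
CREATE TABLE #icao (id int NOT NULL PRIMARY KEY, icao nchar(4) NOT NULL);
CREATE TABLE #flight (dep_id int NOT NULL, arr_id int NOT NULL);

INSERT INTO #icao (id, icao)
VALUES (1, N'LPPT'), (2, N'LPFR'), (3, N'LPMA'), (4, N'LPPR'), (6, N'LEPA');
INSERT INTO #flight (dep_id, arr_id)
VALUES (1, 2), (6, 2), (6, 2), (3, 2), (3, 4), (3, 6);

SELECT
    (SELECT TOP 1 i.icao
     FROM #flight AS f
     INNER JOIN #icao AS i ON i.id = f.arr_id
     GROUP BY i.icao
     ORDER BY COUNT(*) DESC, i.icao) AS most_arrivals,   -- icao breaks ties
    (SELECT TOP 1 i.icao
     FROM #flight AS f
     INNER JOIN #icao AS i ON i.id = f.dep_id
     GROUP BY i.icao
     ORDER BY COUNT(*) DESC, i.icao) AS most_departures; -- icao breaks ties
```

If all tied airports should be reported instead, TOP 1 WITH TIES is an alternative, but it can return several rows and therefore no longer fits in a scalar subquery.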