Trying to use SQL to group accounts by number of sub-types

I'm using a table that houses account info. Each account can have between 1 and 6 unique sub-types. Currently the table only distinguishes single from multi sub-type accounts; it doesn't show how many accounts have each number of sub-types (how many accounts have 2 sub-types vs. 3 sub-types, and so on). I'm looking for a purely SQL way to count the accounts in each grouping. There are a LOT of accounts in the table, so pulling it manually isn't really an option. Is there a way to get a count of accounts for each number of sub-types?
| account | Sub-Type | Single_V_Multi |
|---------|--------- | -------------- |
|123456789|123456789 | Multi |
|123456789|123456790 | Multi |
|123456789|123456791 | Multi |
|123456792|123456792 | Single |
|123456793|123456793 | Multi |
|123456793|123456794 | Multi |
|123456795|123456795 | Single |
|123456796|123456796 | Single |
|123456797|123456797 | Single |
|123456798|123456798 | Single |
|123456799|123456799 | Multi |
|123456799|123456800 | Multi |
|123456799|123456801 | Multi |
|123456799|123456802 | Multi |
From this example I'd be looking to get separate counts of the Account column based on the number of unique Sub-Type. What I've done so far is a query that groups the Sub-Types:
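-- the hyphenated column name must be quoted (double quotes are standard SQL; MySQL uses backticks)
SELECT account, COUNT(DISTINCT "Sub-Type") AS BAN_SUB_COUNT
FROM Table
GROUP BY account
Which gives the output: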
| account | BAN_SUB_COUNT |
| ------- | ------------- |
|123456789| 3 |
|123456792| 1 |
|123456793| 2 |
|123456795| 1 |
|123456796| 1 |
|123456797| 1 |
|123456798| 1 |
|123456799| 4 |
What I need from this is a way to get a separate count of accounts for each of the distinct BAN_SUB_COUNT entries. Ideally it would be along the lines of:
| BAN_SUB_COUNT |count of Accounts|
| ------------- | --------------- |
| 1 | 5 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
Sorry for any confusion and I hope I'm explaining myself better here!

You just need to wrap your query with another one:
select ban_sub_count, count(distinct account) as count_of_accounts
from (
    -- inner query: one row per account with its number of distinct sub-types
    SELECT account, COUNT(DISTINCT "Sub-Type") as BAN_SUB_COUNT
    FROM Table
    group by account
) z
group by ban_sub_count
Output:
| BAN_SUB_COUNT | count of Accounts |
| ------------- | ----------------- |
| 1 | 5 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
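To check the two-level aggregation against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The table name `accounts` and the alias `per_account` are placeholders of mine, and SQLite stands in for whatever engine the asker actually uses. Because the inner query already returns one row per account, the outer query can use COUNT(*) here.

```python
import sqlite3

# In-memory database; "accounts" is a placeholder table name.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE accounts (account INTEGER, "Sub-Type" INTEGER)')

# Sample rows copied from the question (account, sub-type).
sample = [
    (123456789, 123456789), (123456789, 123456790), (123456789, 123456791),
    (123456792, 123456792),
    (123456793, 123456793), (123456793, 123456794),
    (123456795, 123456795), (123456796, 123456796),
    (123456797, 123456797), (123456798, 123456798),
    (123456799, 123456799), (123456799, 123456800),
    (123456799, 123456801), (123456799, 123456802),
]
conn.executemany("INSERT INTO accounts VALUES (?, ?)", sample)

query = """
SELECT ban_sub_count, COUNT(*) AS count_of_accounts
FROM (
    -- inner level: number of distinct sub-types per account
    SELECT account, COUNT(DISTINCT "Sub-Type") AS ban_sub_count
    FROM accounts
    GROUP BY account
) AS per_account
-- outer level: number of accounts per sub-type count
GROUP BY ban_sub_count
ORDER BY ban_sub_count
"""
rows = conn.execute(query).fetchall()
print(rows)
```

On the sample data this gives five accounts with one sub-type and one account each with two, three, and four sub-types, matching the table the asker wanted.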

Here is my attempt at an answer:
select a2.*,a.`count_sub_type`
FROM (
select count(`sub-type`) as count_sub_type,`sub-type` from account group by `sub-type`
) a
left join account a2 on a2.`sub-type` = a.`sub-type`;
Output:
|account |sub-type|single_v_multi|count_sub_type|
|--------|--------|--------------|--------------|
|account6|type1 |multiview |3 |
|account5|type1 |single |3 |
|account1|type1 |single |3 |
|account4|type2 |single |2 |
|account2|type2 |single |2 |
|account6|type3 |single |2 |
|account3|type3 |single |2 |
Best regards,

Related

Count string occurrences within a list column SQL/Grafana

I have a table in the following format:
| id | tags |
|----|-------------------------|
|1 |['Car', 'Plane', 'Truck']|
|2 |['Plane', 'Truck'] |
|3 |['Car', 'Plane'] |
|4 |['Plane'] |
|5 |['Boat', 'Truck'] |
How can I create a table that gives me the total number of occurrences of each item in all cells of the "tags" column? Items ideally do not include single quotes, but may if necessary.
The resulting table would look like:
| tag | count |
|-------|-------|
| Car | 2 |
| Plane | 4 |
| Truck | 3 |
| Boat | 1 |
The following does not work because it only counts identical "tags" entries rather than comparing list contents.
SELECT u.id, count(u.tags) as cnt
FROM table u
group by 1
order by cnt desc;
I am aware of this near-identical question, but they are using Snowflake/SQL whereas I am using MySQL/Grafana so the accepted answer uses functions unavailable to me.

How to design tables to allow for multi-field query on one row

I am very new to database design and am using MS Access to try to achieve my task. I am trying to create a database design that will allow the name and description of two items to be queried
on a single row of information. Here is the problem: certain items are converted to other particular items.
Any item can have multiple conversions performed on it, and every conversion involves two items.
In this sense, we have a many-to-many relationship, which necessitates the use of an intermediate table. My
tables must be structured in a way that allows me to query, in one row, the item IDs and names
of the items involved in a conversion.
My current table layout is as follows:
Items
+--------+----------+------------------+--+
| ItemID*| ItemName | ItemDescription | |
+--------+----------+------------------+--+
| 1 | DESK | WOOD, 4 LEG | |
| 2 | SHELF | WOOD, SOLID BASE | |
| 3 | TABLE | WOOD, 4 LEG | |
+--------+----------+------------------+--+
ItemConversions
+------------------+--------------+
| ConversionID(CK) | Item1_ID(CK) |
+------------------+--------------+
| 1 | 2 |
| 2 | 2 |
| 3 | 1 |
+------------------+--------------+
Conversions
+---------------+----------+----------+
| ConversionID* | Item1_ID | Item2_ID |
+---------------+----------+----------+
| 1 | 2 | 1 |
| 2 | 2 | 3 |
| 3 | 1 | 3 |
+---------------+----------+----------+
What I want is for it to be possible to achieve the kind of query I described above, though I don't think
my current layout is going to work for this, since the tables are only being joined on Item1_ID. Any advice
would be appreciated, hopefully my tables are not too specific and this is easily understandable.
A sample query output might look like this:
+--------------+----------+----------+----------+----------+
| ConversionID | Item1_ID | ItemName | Item2_ID | ItemName |
+--------------+----------+----------+----------+----------+
| 1 | 2 | SHELF | 1 | DESK |
+--------------+----------+----------+----------+----------+
I got it working how I wanted to with the help of June7's suggestion - I didn't know you could add in tables
multiple times in the query design page (very useful!). As for the tables, I edited the layout so that I have only
Items and Conversions (I deleted ItemConversions). Using the SQL AS keyword I was able to write a query that pulls
the data I want from the tables. The table and query layout can be seen below:
Items
+--------+----------+------------------+--+
| ItemID*| ItemName | ItemDescription | |
+--------+----------+------------------+--+
| 1 | DESK | WOOD, 4 LEG | |
| 2 | SHELF | WOOD, SOLID BASE | |
| 3 | TABLE | WOOD, 4 LEG | |
+--------+----------+------------------+--+
Conversions
+---------------+----------+----------+
| ConversionID* | Item1_ID | Item2_ID |
+---------------+----------+----------+
| 1 | 2 | 1 |
| 2 | 2 | 3 |
| 3 | 3 | 1 |
+---------------+----------+----------+
Query:
SELECT
Conversions.ConversionID,
Conversions.Item1_ID,
Conversions.Item2_ID,
Items.ItemName,
Items_1.ItemName
FROM
(
Conversions
INNER JOIN
Items
ON Conversions.Item1_ID = Items.ItemID
)
INNER JOIN
Items AS Items_1
ON Conversions.Item2_ID = Items_1.ItemID;
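The key idea above is joining the same Items table twice under different aliases, once per item column. As a rough illustration outside Access, the sketch below runs the equivalent query in SQLite through Python's sqlite3 module, using the Items and Conversions rows shown above. The aliases `i1` and `i2` and the output column names `Item1_Name` and `Item2_Name` are my own choices, not from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (
    ItemID INTEGER PRIMARY KEY,
    ItemName TEXT,
    ItemDescription TEXT
);
CREATE TABLE Conversions (
    ConversionID INTEGER PRIMARY KEY,
    Item1_ID INTEGER,
    Item2_ID INTEGER
);
INSERT INTO Items VALUES
    (1, 'DESK', 'WOOD, 4 LEG'),
    (2, 'SHELF', 'WOOD, SOLID BASE'),
    (3, 'TABLE', 'WOOD, 4 LEG');
INSERT INTO Conversions VALUES (1, 2, 1), (2, 2, 3), (3, 3, 1);
""")

query = """
SELECT c.ConversionID,
       c.Item1_ID, i1.ItemName AS Item1_Name,
       c.Item2_ID, i2.ItemName AS Item2_Name
FROM Conversions AS c
JOIN Items AS i1 ON c.Item1_ID = i1.ItemID  -- first copy of Items
JOIN Items AS i2 ON c.Item2_ID = i2.ItemID  -- second copy of Items
ORDER BY c.ConversionID
"""
conversion_rows = conn.execute(query).fetchall()
for row in conversion_rows:
    print(row)
```

Giving the two name columns distinct aliases avoids the ambiguity of having two result columns both called ItemName.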

SELECTing Related Rows Based on a Single Row Match

I have the following table running on PostgreSQL 9.5:
+---+------------+-------------+
|ID | trans_id | message |
+---+------------+-------------+
| 1 | 1234567 | abc123-ef |
| 2 | 1234567 | def234-gh |
| 3 | 1234567 | ghi567-ij |
| 4 | 8902345 | ced123-ef |
| 5 | 8902345 | def234-bz |
| 6 | 8902345 | ghi567-ij |
| 7 | 6789012 | abc123-ab |
| 8 | 6789012 | def234-cd |
| 9 | 6789012 | ghi567-ef |
|10 | 4567890 | abc123-ab |
|11 | 4567890 | gex890-aj |
|12 | 4567890 | ghi567-ef |
+---+------------+-------------+
I am looking for the rows for each trans_id based on a LIKE query, like this:
SELECT * FROM table
WHERE message LIKE '%def234%'
This, of course, returns just three rows, the three that match my pattern in the message column. What I am looking for, instead, is all the rows matching that trans_id in groups of messages that match. That is, if a single row matches the pattern, get all the rows with the trans_id of that matching row.
That is, the results would be:
+---+------------+-------------+
|ID | trans_id | message |
+---+------------+-------------+
| 1 | 1234567 | abc123-ef |
| 2 | 1234567 | def234-gh |
| 3 | 1234567 | ghi567-ij |
| 4 | 8902345 | ced123-ef |
| 5 | 8902345 | def234-bz |
| 6 | 8902345 | ghi567-ij |
| 7 | 6789012 | abc123-ab |
| 8 | 6789012 | def234-cd |
| 9 | 6789012 | ghi567-ef |
+---+------------+-------------+
Notice rows 10, 11, and 12 were not SELECTed because none of them matched the %def234% pattern.
I have tried (and failed) to write a sub-query to get all the related rows when a single message matches a pattern:
SELECT sub.*
FROM (
SELECT DISTINCT trans_id FROM table WHERE message LIKE '%def234%'
) sub
WHERE table.trans_id = sub.trans_id
I could easily do this with two queries, but the list of matching trans_ids returned by the first query, to be included in a WHERE trans_id IN (<huge list of trans_ids>) clause, would be very large, which would be a very inefficient way of doing this. I believe there is a way to do it with a single query.
Thank you!
I think this will do the job:
WITH sub AS (
    -- DISTINCT avoids duplicated rows when several messages of one trans_id match
    SELECT DISTINCT trans_id
    FROM table
    WHERE message LIKE '%def234%'
)
SELECT *
FROM table JOIN sub USING (trans_id);
Hope this helps.
Try this:
SELECT ID, trans_id, message
FROM (
SELECT ID, trans_id, message,
COUNT(*) FILTER (WHERE message LIKE '%def234%')
OVER (PARTITION BY trans_id) AS pattern_cnt
FROM mytable) AS t
WHERE pattern_cnt >= 1
Using a FILTER clause in the windowed version of COUNT function we can get the number of records matching the predefined pattern within each trans_id slice. The outer query uses this count to filter out irrelevant slices.
You can do this.
WITH trans
AS
(SELECT DISTINCT trans_id
FROM t1
WHERE message LIKE '%def234%')
SELECT t1.*
FROM t1,
trans
WHERE t1.trans_id = trans.trans_id;
I think this will perform better. If you have enough data, you can run EXPLAIN on both the subquery and the CTE versions and compare the plans.
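For comparison, the same "match one row, return the whole trans_id group" idea can also be written as an IN subquery inside a single statement. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module rather than PostgreSQL, and the table name `messages` is a placeholder of mine. The rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, trans_id INTEGER, message TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, 1234567, "abc123-ef"), (2, 1234567, "def234-gh"), (3, 1234567, "ghi567-ij"),
        (4, 8902345, "ced123-ef"), (5, 8902345, "def234-bz"), (6, 8902345, "ghi567-ij"),
        (7, 6789012, "abc123-ab"), (8, 6789012, "def234-cd"), (9, 6789012, "ghi567-ef"),
        (10, 4567890, "abc123-ab"), (11, 4567890, "gex890-aj"), (12, 4567890, "ghi567-ef"),
    ],
)

query = """
SELECT id, trans_id, message
FROM messages
WHERE trans_id IN (
    -- every trans_id that has at least one matching message
    SELECT trans_id FROM messages WHERE message LIKE '%def234%'
)
ORDER BY id
"""
related_rows = conn.execute(query).fetchall()
for row in related_rows:
    print(row)
```

The subquery is evaluated by the database itself, so no list of trans_ids has to travel back to the client between two separate queries. Whether this or a join performs better depends on the engine and the data, so comparing plans is still worthwhile.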

When Querying Many-To-Many Relationship in SQL, Return Multiple Connections As an Array In Single Row?

Basically, I have 3 tables: title, provider, and provider_title.
Let's say they look like this:
| title_id | title_name |
|------------|----------------|
| 1 | San Andreas |
| 2 |Human Centipede |
| 3 | Zoolander 2 |
| 4 | Hot Pursuit |
| provider_id| provider_name |
|------------|----------------|
| 1 | Hulu |
| 2 | Netflix |
| 3 | Amazon_Prime |
| 4 | HBO_GO |
| provider_id| title_id |
|------------|----------------|
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 1 |
| 3 | 3 |
| 4 | 4 |
So, clearly there are titles with multiple providers, yeah? Typical many-to-many so far.
So what I'm doing to query it is with a JOIN like the following:
SELECT *
FROM provider_title
JOIN provider ON provider_title.provider_id = provider.provider_id
JOIN title ON title.title_id = provider_title.title_id
WHERE provider.provider_name IN ('Netflix', 'HBO_GO', 'Hulu', 'Amazon_Prime')
Ok, now to the actual issue. I don't want repeated title names back, but I do want all of the providers associated with the title. Let me explain with another table. Here is what I am getting back with the current query, as is:
| provider_id| provider_name | title_id | title_name |
|------------|---------------|----------|---------------|
| 1 | Hulu | 1|San Andreas |
| 1 | Hulu | 2|Human Centipede|
| 2 | Netflix | 1|San Andreas |
| 3 | Amazon_Prime | 1|San Andreas |
| 3 | Amazon_Prime | 3|Zoolander 2 |
| 4 | HBO_GO | 4|Hot Pursuit |
But what I really want would be something more like
| provider_id| provider_name |title_id| title_name|
|------------|-----------------------------|--------|-----------|
| [1, 2, 3] |[Hulu, Netflix, Amazon_Prime]| 1|San Andreas|
Meaning I only want distinct titles back, but I still want each title's associated providers. Is this only possible to do post-sql query with logic iterating through the returned rows?
Depending on your database engine, there may be an aggregation function to help achieve this.
For example, this SQLfiddle demonstrates the postgres array_agg function:
SELECT t.title_id,
t.title_name,
array_agg( p.provider_id ),
array_agg( p.provider_name )
FROM provider_title as pt
JOIN
provider as p
ON pt.provider_id = p.provider_id
JOIN title as t
ON t.title_id = pt.title_id
GROUP BY t.title_id,
t.title_name
Other database engines have equivalents. For example:
MySQL has GROUP_CONCAT
Oracle has LISTAGG
SQLite has group_concat (as well!)
If your database isn't covered by the above, you can google '[Your database engine] aggregate comma delimited string'
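As a small illustration of one of these equivalents, the sketch below uses SQLite's group_concat through Python's sqlite3 module on the sample rows from the question. SQLite returns a comma-separated string rather than an array, so the string is split in Python afterwards. The concatenation order is not guaranteed, which is why the names are sorted; the variable name `providers_by_title` is mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (title_id INTEGER PRIMARY KEY, title_name TEXT);
CREATE TABLE provider (provider_id INTEGER PRIMARY KEY, provider_name TEXT);
CREATE TABLE provider_title (provider_id INTEGER, title_id INTEGER);
INSERT INTO title VALUES
    (1, 'San Andreas'), (2, 'Human Centipede'), (3, 'Zoolander 2'), (4, 'Hot Pursuit');
INSERT INTO provider VALUES
    (1, 'Hulu'), (2, 'Netflix'), (3, 'Amazon_Prime'), (4, 'HBO_GO');
INSERT INTO provider_title VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 3), (4, 4);
""")

query = """
SELECT t.title_id,
       t.title_name,
       group_concat(p.provider_name) AS provider_names
FROM provider_title AS pt
JOIN provider AS p ON pt.provider_id = p.provider_id
JOIN title AS t ON t.title_id = pt.title_id
GROUP BY t.title_id, t.title_name
ORDER BY t.title_id
"""

# One row per title; split the concatenated names back into a sorted list.
providers_by_title = {
    title_name: sorted(names.split(","))
    for _title_id, title_name, names in conn.execute(query)
}
print(providers_by_title)
```

Splitting on commas only works because none of the sample provider names contains a comma; group_concat accepts a second argument if a different separator is needed.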

SQL query for many-to-many self-join

I have a database table that has a companion many-to-many self-join table alongside it. The primary table is part and the other table is alternate_part (basically, alternate parts are identical to their main part with different #s). Every record in the alternate_part table is also in the part table. To illustrate:
`part`
| part_id | part_number | description |
|---------|-------------|-------------|
| 1 | 00001 | wheel |
| 2 | 00002 | tire |
| 3 | 00003 | window |
| 4 | 00004 | seat |
| 5 | 00005 | wheel |
| 6 | 00006 | tire |
| 7 | 00007 | window |
| 8 | 00008 | seat |
| 9 | 00009 | wheel |
| 10 | 00010 | tire |
| 11 | 00011 | window |
| 12 | 00012 | seat |
`alternate_part`
| main_part_id | alt_part_id |
|--------------|-------------|
| 1 | 5 | // Wheel
| 5 | 1 | // |
| 5 | 9 | // |
| 9 | 5 | // |
| 2 | 6 | // Tire
| 6 | 2 | // |
| ... | ... | // |
I am trying to produce a simple SQL query that will give me a list of all alternates for a main part. The tricky part is: some alternates are only listed as alternates of alternates, it is not guaranteed that every viable alternate for a part is listed as a direct alternate. e.g., if 'Part 3' is an alternate of 'Part 2' which is an alternate of 'Part 1', then Part 3 is an alternate of Part 1 (even if the alternate_part table doesn't list a direct link). The reverse is also true (Part 1 is an alternate of Part 3).
Basically, right now I'm pulling alternates and iterating through them
SELECT p.*, ap.*
FROM part p
INNER JOIN alternate_part ap ON p.part_id = ap.main_part_id
And then going back and doing the same again on those alternates. But, I think there's got to be a better way.
The SQL query I'm looking for will basically give me:
| part_id | alt_part_id |
|---------|-------------|
| 1 | 5 |
| 1 | 9 |
For part_id = 1, even when 1 & 9 are not explicitly linked in the alternates table.
Note: I have no control whatever over the structure of the DB, it is a distributed software solution.
Note 2: It is an Oracle platform, if that affects syntax.
You need to walk the alternates as a hierarchy, probably with a CONNECT BY PRIOR query using NOCYCLE,
something like this:
select distinct p.part_id, p.part_number, p.description, c.main_part_id
from part p
left join (
    select main_part_id, connect_by_root(main_part_id) real_part_id
    from alternate_part
    connect by NOCYCLE prior main_part_id = alt_part_id
) c
on p.part_id = c.real_part_id and p.part_id != c.main_part_id
order by p.part_id
You can read full documentation about Hierarchical queries at http://docs.oracle.com/cd/B28359_01/server.111/b28286/queries003.htm
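To make the "alternates of alternates" traversal concrete, here is a sketch that uses a different technique from the CONNECT BY query above: a recursive common table expression, run in SQLite through Python's sqlite3 module. It is not Oracle syntax and is only meant to show the logic. The CTE name `reachable` and the helper `alternates_of` are hypothetical names of mine, and only the alternate_part rows listed in the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alternate_part (main_part_id INTEGER, alt_part_id INTEGER)")
conn.executemany(
    "INSERT INTO alternate_part VALUES (?, ?)",
    [(1, 5), (5, 1), (5, 9), (9, 5), (2, 6), (6, 2)],
)

query = """
WITH RECURSIVE reachable(part_id) AS (
    SELECT ?                      -- start from the requested part
    UNION                         -- UNION (not UNION ALL) drops repeats, so cycles terminate
    SELECT ap.alt_part_id
    FROM alternate_part AS ap
    JOIN reachable AS r ON ap.main_part_id = r.part_id
)
SELECT part_id
FROM reachable
WHERE part_id <> ?                -- exclude the starting part itself
ORDER BY part_id
"""

def alternates_of(part_id):
    """Return every part reachable from part_id through alternate links."""
    return [row[0] for row in conn.execute(query, (part_id, part_id))]

print(alternates_of(1))
print(alternates_of(2))
```

For part 1 this returns parts 5 and 9, even though 1 and 9 are never linked directly in alternate_part.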