Materialized view only for covering index - sql

Given the following table car_year, which records the year of each car model:
| id | maker | model | year |
----------------------------------
| 6 | Audi | Allroad | 2001 |
| 12 | Audi | A8 | 2008 |
| 14 | Ford | Mustang | 1996 |
| 15 | Honda | Civic | 2000 |
| 19 | Honda | Insight | 2000 |
| 22 | Ford | F150 | 2009 |
| 24 | Honda | Accord | 2000 |
| 28 | Ford | F150 | 2007 |
| 34 | Audi | S8 | 2002 |
| 48 | Ford | Expedition | 2011 |
| 62 | Ford | Escort | 2004 |
| 81 | Ford | Explorer | 2007 |
| 84 | Ford | Escape | 2006 |
| 93 | Honda | Accord | 1995 |
I would like to have a covering index for the "earliest model of a maker".
My solution is to create a materialized view:
CREATE MATERIALIZED VIEW earliest AS
SELECT DISTINCT ON(maker) maker, model
FROM car_year
ORDER BY maker, year
And then a covering index over it:
CREATE INDEX earliest_index ON earliest(maker) INCLUDE (model);
It works! But the materialized view itself is useless (for my usage) because I will only ever use the covering index.
Am I missing a more elegant solution, or a (Postgre)SQL feature that I don't know about?

If I get you right, you are asking for an “index-organized table” (or an index-organized materialized view in this case), that is, a table that is really an index. Then you wouldn't have to waste twice the storage: once for the table that you never use, and once for the index on which you want to perform index-only scans.
The answer is that such “index-organized tables” don't exist in PostgreSQL.
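To make the trade-off concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, not PostgreSQL. SQLite has neither DISTINCT ON nor INCLUDE, so the correlated subquery and the two-column index are assumptions made only for this illustration. It shows that the query is answered from the index alone while the base table still has to exist next to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE car_year (id INTEGER PRIMARY KEY, maker TEXT, model TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO car_year VALUES (?, ?, ?, ?)",
    [
        (6, "Audi", "Allroad", 2001), (12, "Audi", "A8", 2008),
        (14, "Ford", "Mustang", 1996), (15, "Honda", "Civic", 2000),
        (19, "Honda", "Insight", 2000), (22, "Ford", "F150", 2009),
        (24, "Honda", "Accord", 2000), (28, "Ford", "F150", 2007),
        (34, "Audi", "S8", 2002), (48, "Ford", "Expedition", 2011),
        (62, "Ford", "Escort", 2004), (81, "Ford", "Explorer", 2007),
        (84, "Ford", "Escape", 2006), (93, "Honda", "Accord", 1995),
    ],
)

# Stand-in for the materialized view. Assumption: no two models of a maker
# share the earliest year (a tie would produce several rows per maker here).
conn.execute("""
    CREATE TABLE earliest AS
    SELECT maker, model
    FROM car_year AS c
    WHERE year = (SELECT MIN(year) FROM car_year WHERE maker = c.maker)
""")

# Stand-in for INCLUDE (model): a plain two-column index that covers the query.
conn.execute("CREATE INDEX earliest_index ON earliest (maker, model)")

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT model FROM earliest WHERE maker = ?", ("Ford",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

ford_model = conn.execute(
    "SELECT model FROM earliest WHERE maker = ?", ("Ford",)
).fetchone()[0]

print(plan_text)   # the detail text mentions a covering index
print(ford_model)
```

The table `earliest` is never read by the lookup, yet it is still stored alongside the index, which is the duplication described above.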

Related

How to visualize data from identical tables in a single chart in Looker Studio?

We have multiple production environments and we want to see data from those environments in one chart. Each environment has identical tables with the same schema, each holding its own data. We need an intuitive way to merge the data from these environments and show it in one chart.
Let's take an example here:
Table 1
| name | age |
| Sam | 42 |
| Mary | 19 |
Table 2
| name | age |
| Adam | 22 |
| James | 45 |
The outcome should be:
| name | age |
| Sam | 42 |
| Mary | 19 |
| Adam | 22 |
| James | 45 |
We would like to use this outcome data for our chart visualization.
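In SQL terms, the outcome above is a UNION ALL of the two tables. The following is a minimal sketch using Python's sqlite3 module; the table names `env1_people` and `env2_people` are hypothetical, and whether such a query can be used at all depends on the data source (this assumes one that accepts a custom SQL query).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical names for the identical tables of two environments.
    CREATE TABLE env1_people (name TEXT, age INTEGER);
    CREATE TABLE env2_people (name TEXT, age INTEGER);
    INSERT INTO env1_people VALUES ('Sam', 42), ('Mary', 19);
    INSERT INTO env2_people VALUES ('Adam', 22), ('James', 45);
""")

# UNION ALL appends the rows of both tables without removing duplicates.
merged = conn.execute("""
    SELECT name, age FROM env1_people
    UNION ALL
    SELECT name, age FROM env2_people
""").fetchall()

for row in merged:
    print(row)
```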

Teradata SQL Assistant - How can I pivot or transpose large tables with many columns and many rows?

I am using Teradata SQL Assistant Version TD 16.10.06.01 ...
I have seen a lot of people transpose data for smallish tables, but I am working with thousands of clients and need to break the columns up into line item values to compare orders and highlight differences between orders. The problem is that the data is all laid out horizontally, and I need to transpose it to Id, Transaction id, Version and Line Item Value 1, Line Item Value 2..., plus another column comparing the values to see if they changed.
example:
+----+------------+-----------+------------+----------------+--------+----------+----------+------+-------------+
| Id | First Name | Last Name | DOB | transaction id | Make | Location | Postcode | Year | Price |
+----+------------+-----------+------------+----------------+--------+----------+----------+------+-------------+
| 1 | John | Smith | 15/11/2001 | 1654654 | Audi | NSW | 2222 | 2019 | $ 10,000.00 |
| 2 | Mark | White | 11/02/2002 | 1661200 | BMW | WA | 8888 | 2016 | $ 8,999.00 |
| 3 | Bob | Grey | 10/05/2002 | 1667746 | Ford | QLD | 9999 | 2013 | $ 3,000.00 |
| 4 | Phil | Faux | 6/08/2002 | 1674292 | Holden | SA | 1111 | 2000 | $ 5,800.00 |
+----+------------+-----------+------------+----------------+--------+----------+----------+------+-------------+
I am hoping to change the data to:
+----+----------+----------+----------+----------------+----------+----------+----------------+---------+-----+
| id | trans_id | Vers_ord | Item Val | Ln_Itm_Dscrptn | Org_Val | Updt_Val | Amndd_Ord_chck | Lbl_Rnk | ... |
+----+----------+----------+----------+----------------+----------+----------+----------------+---------+-----+
| 1 | 1654654 | 2 | 11169 | Make | Audi BLK | Audi WHT | Yes | 1 | |
| 1 | 1654654 | 2 | 11189 | Location | NSW | WA | Yes | 2 | |
| 1 | 1654654 | 2 | 23689 | Postcode | 2222 | 6000 | Yes | 3 | |
+----+----------+----------+----------+----------------+----------+----------+----------------+---------+-----+
Recently, with smaller data, I created a table, added in the values, then used a case statement (when value 1 then xyz) with a product join ... and the data warehouse admins didn't mention anything out of order. But I only had a 16-row by 200-column table to transpose (Sum, Avg, Count, Median (function) x 4 subsets of clients), which was significantly smaller than the tables I currently need to compare.
I am worried my prior method will slow the data warehouse down, and it would take me a significant amount of time to type the SQL.
Is there a better way to transpose large tables?
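For readers unfamiliar with the prior method mentioned above (a label table, a product join and a CASE expression), here is a minimal sketch of it. It runs on Python's sqlite3 module rather than Teradata, the table names `orders` and `labels` are hypothetical, and only three of the columns from the example are unpivoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical wide table holding a subset of the example columns.
    CREATE TABLE orders (id INTEGER, transaction_id INTEGER,
                         make TEXT, location TEXT, postcode TEXT);
    INSERT INTO orders VALUES
        (1, 1654654, 'Audi', 'NSW', '2222'),
        (2, 1661200, 'BMW',  'WA',  '8888');

    -- One row per column that should become a line item.
    CREATE TABLE labels (lbl_rnk INTEGER, ln_itm_dscrptn TEXT);
    INSERT INTO labels VALUES (1, 'Make'), (2, 'Location'), (3, 'Postcode');
""")

# The product (cross) join repeats every order once per label;
# the CASE expression then picks the column that belongs to that label.
long_rows = conn.execute("""
    SELECT o.id, o.transaction_id, l.lbl_rnk, l.ln_itm_dscrptn,
           CASE l.lbl_rnk
               WHEN 1 THEN o.make
               WHEN 2 THEN o.location
               WHEN 3 THEN o.postcode
           END AS item_value
    FROM orders AS o
    CROSS JOIN labels AS l
    ORDER BY o.id, l.lbl_rnk
""").fetchall()

for row in long_rows:
    print(row)
```

The number of output rows is the number of orders multiplied by the number of labels, which is why this approach grows quickly on wide tables.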

joining wide tables (10s of unique cols)

I have multiple tables which need to be joined on several common attributes so that their different attributes can be shown in a single table.
table1
+--------+---------+-------+
| make | model | r_yr |
+--------+---------+-------+
| toyota | corolla | 1999 |
| toyota | camry | 2002 |
| toyota | qualis | 2004 |
| toyota | rav4 | 2006 |
+--------+---------+-------+
table2
+--------+---------+--------+
| make | model | kms |
+--------+---------+--------+
| toyota | corolla | 25000 |
| toyota | camry | 50000 |
+--------+---------+--------+
table3
+--------+---------+---------+
| make | model | mileage |
+--------+---------+---------+
| toyota | corolla | 20 |
| toyota | qualis | 25 |
+--------+---------+---------+
table4
+--------+----------+-------+
| make | model | colr |
+--------+----------+-------+
| toyota | camry | blue |
| toyota | rav4 | green |
+--------+----------+-------+
I'm doing the following to join the results
select a.make, a.model,a.r_yr,b.kms,c.mileage,d.colr
from table1 as a
left join table2 as b
on b.make=a.make and b.model=a.model and b.r_yr=a.r_yr
left join table3 as c
on c.make=a.make and c.model=a.model and c.r_yr=a.r_yr
left join table4 as d
on d.make=a.make and d.model=a.model and d.r_yr=a.r_yr
This gives a table like below
+--------+---------+-------+-------+----------+--------+
| make | model | r_yr | kms | mileage | colr |
+--------+---------+-------+-------+----------+--------+
| toyota | corolla | 1999 | 25000 | 20 | |
| toyota | camry | 2002 | 50000 | | blue |
| toyota | qualis | 2004 | | 25 | |
| toyota | rav4 | 2006 | | | green |
+--------+---------+-------+-------+----------+--------+
However, the issue I have is that in the real data set I'm working with there are 5 common columns per table and around 20-40 unique attributes per table, which requires specifying 20-40 column names in the query in the form of b.kms, ....,c.mileage, ......,d.colr,..... Is there a workaround that avoids listing those unique columns, such as selecting all columns except the common ones, or some other way?
You cannot do something like SELECT all except x,y,z ...
But you can simplify this query by using the USING clause instead of JOIN ... ON
Demo: http://sqlfiddle.com/#!17/fa97a/6
select *
from table1 as a
left join table2 as b
USING (make, model)
left join table3 as c
USING (make, model)
left join table4 as d
USING (make, model)
| make | model | r_yr | kms | mileage | colr |
|--------|---------|------|--------|---------|--------|
| toyota | camry | 2002 | 50000 | (null) | blue |
| toyota | corolla | 1999 | 25000 | 20 | (null) |
| toyota | qualis | 2004 | (null) | 25 | (null) |
| toyota | rav4 | 2006 | (null) | (null) | green |
Note: In the above example I am using only two common columns (make, model), because in your example r_yr is not a common column: it exists only in table1.
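The USING query above can also be tried locally. This is a minimal sketch with Python's sqlite3 module, so SQLite stands in for the database behind the fiddle; the ORDER BY is an addition made only to get a stable row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (make TEXT, model TEXT, r_yr INTEGER);
    CREATE TABLE table2 (make TEXT, model TEXT, kms INTEGER);
    CREATE TABLE table3 (make TEXT, model TEXT, mileage INTEGER);
    CREATE TABLE table4 (make TEXT, model TEXT, colr TEXT);
    INSERT INTO table1 VALUES ('toyota', 'corolla', 1999), ('toyota', 'camry', 2002),
                              ('toyota', 'qualis', 2004), ('toyota', 'rav4', 2006);
    INSERT INTO table2 VALUES ('toyota', 'corolla', 25000), ('toyota', 'camry', 50000);
    INSERT INTO table3 VALUES ('toyota', 'corolla', 20), ('toyota', 'qualis', 25);
    INSERT INTO table4 VALUES ('toyota', 'camry', 'blue'), ('toyota', 'rav4', 'green');
""")

# With USING, SELECT * returns each join column only once,
# so no column list has to be typed out.
cur = conn.execute("""
    SELECT *
    FROM table1 AS a
    LEFT JOIN table2 AS b USING (make, model)
    LEFT JOIN table3 AS c USING (make, model)
    LEFT JOIN table4 AS d USING (make, model)
    ORDER BY a.model
""")
columns = [col[0] for col in cur.description]
rows = cur.fetchall()

print(columns)
for row in rows:
    print(row)
```

Missing matches come back as None in Python, which corresponds to the (null) cells in the result table above.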

Postgres: select n unique rows for ID

Using Postgres, I have a scenario where I need to return a variable number of rows for each unique id in a SQL statement.
Consider I have a table of the cars a user has owned over the years.
+----+----------+---------+-------+
| ID | make | model | type |
+----+----------+---------+-------+
| 1 | toyota | camry | sedan |
| 1 | ford | mustang | coupe |
| 1 | toyota | celica | coupe |
| 1 | bmw | z4 | coupe |
| 1 | honda | accord | sedan |
| 2 | buick | marque | sedan |
| 2 | delorean | btf | coupe |
| 2 | mini | cooper | coupe |
| 3 | ford | f-150 | truck |
| 3 | ford | mustang | coupe |
| 1 | ford | taurus | sedan |
+----+----------+---------+-------+
From this table I'd only want to return two rows for each user that has a coupe and ignore the rest.
I'd also like to preserve the empty rows, so the second result for ID 3 would be empty because there is only one car of type coupe. I am also working with restrictions, as this has to run on AWS Redshift, so I can't use many functions. It seems this would be easy using a TOP statement like in SQL Server, but with the Redshift restrictions and my lack of knowledge I'm not sure of the best way. So something like:
+----+----------+---------+-------+
| ID | make | model | type |
+----+----------+---------+-------+
| 1 | ford | mustang | coupe |
| 1 | toyota | celica | coupe |
| 2 | delorean | btf | coupe |
| 2 | mini | cooper | coupe |
| 3 | ford | mustang | coupe |
| 3 | | | |
+----+----------+---------+-------+
Thanks a lot for your help.
As far as I know, Redshift supports window functions:
select id, make, model, type
from (
select id, make, model, type,
row_number() over (partition by id order by make) as rn
from the_table
where type = 'coupe'
) t
where rn <= 2
order by id, make;
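To see what this query returns for the sample data, here is a minimal sketch that runs it with Python's sqlite3 module. SQLite is only a stand-in for Redshift here, and its window function support requires SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER, make TEXT, model TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [
        (1, "toyota", "camry", "sedan"), (1, "ford", "mustang", "coupe"),
        (1, "toyota", "celica", "coupe"), (1, "bmw", "z4", "coupe"),
        (1, "honda", "accord", "sedan"), (2, "buick", "marque", "sedan"),
        (2, "delorean", "btf", "coupe"), (2, "mini", "cooper", "coupe"),
        (3, "ford", "f-150", "truck"), (3, "ford", "mustang", "coupe"),
        (1, "ford", "taurus", "sedan"),
    ],
)

# Number the coupes of each id by make, then keep the first two per id.
rows = conn.execute("""
    SELECT id, make, model, type
    FROM (
        SELECT id, make, model, type,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY make) AS rn
        FROM the_table
        WHERE type = 'coupe'
    ) AS t
    WHERE rn <= 2
    ORDER BY id, make
""").fetchall()

for row in rows:
    print(row)
```

Because the rows are numbered by make, ID 1 yields the bmw and ford coupes rather than the ford and toyota rows shown in the question. ID 3 yields a single row: the query does not generate the empty placeholder row that the question asks for.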

SQL Join with Group By

OK, so I'm trying to write a complex query (at least complex to me) and need some pro help. This is my database setup:
Table: MakeList
| MakeListId | Make |
| 1 | Acura |
| 2 | Chevy |
| 3 | Pontiac |
| 4 | Scion |
| 5 | Toyota |
Table: CustomerMake
| CustomerMakeId | CustomerId | _Descriptor |
| 1 | 123 | Acura |
| 2 | 124 | Chevy |
| 3 | 125 | Pontiac |
| 4 | 126 | Scion |
| 5 | 127 | Toyota |
| 6 | 128 | Acura |
| 7 | 129 | Chevy |
| 8 | 130 | Pontiac |
| 9 | 131 | Scion |
| 10 | 132 | Toyota |
Table: Customer
| CustomerId | StatusId |
| 123 | 1 |
| 124 | 1 |
| 125 | 1 |
| 126 | 2 |
| 127 | 1 |
| 128 | 1 |
| 129 | 2 |
| 130 | 1 |
| 131 | 1 |
| 132 | 1 |
What I am trying to end up with is this...
Desired Result Set:
| Make | CustomerId|
| Acura | 123 |
| Chevy | 124 |
| Pontiac | 125 |
| Scion | 131 |
| Toyota | 127 |
I want a list of unique Makes, each with one active (StatusId = 1) CustomerId to go with it. I'm assuming I'll have to do some GROUP BYs and JOINs, but I haven't been able to figure it out. Any help would be greatly appreciated. Let me know if I haven't given enough info for my question. Thanks!
UPDATE: The script doesn't have to be performant. It will be used one time for testing purposes.
Something like this:
select cm._Descriptor,
min(cu.customerid)
from CustomerMake cm
join Customer cu on cu.CustomerId = cm.CustomerId and cu.StatusId = 1
group by cm._Descriptor
I left out the MakeList table as it seems unnecessary, because you are storing the full make name as _Descriptor in the CustomerMake table anyway (so the question is: what is the MakeList table for? Why don't you store a FK to it in the CustomerMake table?)
You want to
(a) join the customer and customermake tables
(b) filter on customer.statusid
(c) group by customermake._descriptor
Depending on your RDBMS, you may need to explicitly apply a group function to customer.customerid to include it in the select list. Since you don't care which particular customerid is displayed, you could use MIN or MAX to just pick an essentially arbitrary value.
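Steps (a) to (c) can be checked against the sample data with a minimal sketch like the one below. It uses Python's sqlite3 module as a stand-in for whichever RDBMS is actually in use, and it adds an ORDER BY only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerMake (CustomerMakeId INTEGER, CustomerId INTEGER, _Descriptor TEXT)"
)
conn.execute("CREATE TABLE Customer (CustomerId INTEGER, StatusId INTEGER)")
conn.executemany(
    "INSERT INTO CustomerMake VALUES (?, ?, ?)",
    [
        (1, 123, "Acura"), (2, 124, "Chevy"), (3, 125, "Pontiac"),
        (4, 126, "Scion"), (5, 127, "Toyota"), (6, 128, "Acura"),
        (7, 129, "Chevy"), (8, 130, "Pontiac"), (9, 131, "Scion"),
        (10, 132, "Toyota"),
    ],
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [
        (123, 1), (124, 1), (125, 1), (126, 2), (127, 1),
        (128, 1), (129, 2), (130, 1), (131, 1), (132, 1),
    ],
)

# (a) join the two tables, (b) keep only active customers,
# (c) group by make and let MIN pick one CustomerId per make.
rows = conn.execute("""
    SELECT cm._Descriptor, MIN(cu.CustomerId)
    FROM CustomerMake AS cm
    JOIN Customer AS cu
      ON cu.CustomerId = cm.CustomerId AND cu.StatusId = 1
    GROUP BY cm._Descriptor
    ORDER BY cm._Descriptor
""").fetchall()

for row in rows:
    print(row)
```

For Scion the inactive customer 126 is filtered out before grouping, so MIN returns 131, which matches the desired result set in the question.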