How to set precedence to specific attribute while maintaining order - sql

I want to select all the rows of the carInventory table, but with the brands that have at least one car in some shade of blue listed first. Within each brand, the cars should then be sorted by year.
I was trying to adapt the solution given in the post "Take precedence on a specific value from a table",
but I can't seem to keep the brands together after I work out the rank:
SELECT
    *,
    RANK() OVER (
        PARTITION BY colBrand
        ORDER BY CASE WHEN colColor LIKE '%blue%' THEN 1 ELSE 0 END
    ) RANK
FROM inventoryTable
ORDER BY Rank, colBrand, colYear
Here's what the tables should look like.
Starting Table
Brand       Make     Color          Year
----------  -------  -------------  ----
Toyota      Corolla  Atlantis Blue  2015
Ford        Focus    Bayside Blue   2016
Porshe      Taycan   Grey           2019
Volkswagen  Taos     Blue           2015
Volkswagen  Jetta    White          2020
Ford        Focus    Aztec Red      2018
Search Result
Brand       Make     Color          Year
----------  -------  -------------  ----
Ford        Focus    Aztec Red      2018
Ford        Focus    Bayside Blue   2016
Toyota      Corolla  Atlantis Blue  2015
Volkswagen  Taos     Blue           2020
Volkswagen  Jetta    White          2015
Porshe      Taycan   Grey           2019

Firstly, the auto industry uses the terms Make and Model to refer to what you call Brand and Make. Using your own terminology will be confusing to many.
I think if you write your order logic more precisely and completely, you will more easily find a path to a solution. And it helps to be consistent. What is "ColBrand"? ColYear? IMO you do everyone a disservice by prefacing column names with a redundant prefix. And here "col" refers to "column"? Just don't!
So it seems you want to sort by <brands with blue vehicles / brands without>, brand, year descending. Notice that Make is not included in your ordering. And notice that your Taos has year 2015 in the table but swaps that year for another in the desired output. This is one reason why you should post a script - helps to avoid typos like that and encourages others to help you.
So here is another way to accomplish that. The CTE selects all brands that have blue colors. You simply outer join the actual table to the CTE to know if the brand satisfies the blue condition. Use that knowledge in the CASE expression in the ORDER BY clause.
with blubr as (select distinct brand from #inventory where color like '%blue%')
select inv.brand, inv.make, inv.color, inv.year
from #inventory as inv left join blubr on inv.brand = blubr.brand
order by case when blubr.brand is not null then 0 else 1 end,
inv.brand, inv.year desc
;
A fiddle demonstrates this. Note that there is a flaw in the logic of the prior answer; I've added a row to illustrate it. Do you see it? It is an easy fix to that query. Is this query better? Not really, but it hopefully helps you think of different approaches to achieving the same goal.
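For anyone who wants to try the CTE approach without a SQL Server instance, here is a minimal sketch using Python's standard-library sqlite3 module. It assumes an ordinary table named inventory in place of the temp table #inventory, and it loads the six rows from the question's starting table. SQLite's LIKE is case-insensitive for ASCII by default, so '%blue%' matches 'Atlantis Blue'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (brand TEXT, make TEXT, color TEXT, year INTEGER)"
)
# Rows copied from the question's starting table.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        ("Toyota", "Corolla", "Atlantis Blue", 2015),
        ("Ford", "Focus", "Bayside Blue", 2016),
        ("Porshe", "Taycan", "Grey", 2019),
        ("Volkswagen", "Taos", "Blue", 2015),
        ("Volkswagen", "Jetta", "White", 2020),
        ("Ford", "Focus", "Aztec Red", 2018),
    ],
)

query = """
WITH blubr AS (SELECT DISTINCT brand FROM inventory WHERE color LIKE '%blue%')
SELECT inv.brand, inv.make, inv.color, inv.year
FROM inventory AS inv
LEFT JOIN blubr ON inv.brand = blubr.brand
ORDER BY CASE WHEN blubr.brand IS NOT NULL THEN 0 ELSE 1 END,
         inv.brand, inv.year DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The brands with a blue car (Ford, Toyota, Volkswagen) come out first, each brand stays together, and the brand without a blue car follows.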

Interesting. Maybe the query below can do what you need.
I also made a demo fiddle.
select *
from inventoryTable i
order by
    case when exists (select 1
                      from inventoryTable t
                      where t.colBrand = i.colBrand
                        and t.colColor like '%blue%')
         then 9999
         else colYear
    end desc,
    colBrand asc,
    colYear desc
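One thing to watch in this query, and possibly the flaw hinted at in the other answer: brands without a blue car get colYear as their first sort key, so two such brands with interleaved years are not kept together. Here is a sketch in Python with sqlite3 that shows the effect. The Honda row and the second Porshe row are made-up additions to the question's data, and the "fixed" ordering is only one possible repair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventoryTable "
    "(colBrand TEXT, colMake TEXT, colColor TEXT, colYear INTEGER)"
)
conn.executemany(
    "INSERT INTO inventoryTable VALUES (?, ?, ?, ?)",
    [
        ("Toyota", "Corolla", "Atlantis Blue", 2015),
        ("Ford", "Focus", "Bayside Blue", 2016),
        ("Porshe", "Taycan", "Grey", 2019),
        ("Volkswagen", "Taos", "Blue", 2015),
        ("Volkswagen", "Jetta", "White", 2020),
        ("Ford", "Focus", "Aztec Red", 2018),
        ("Honda", "Civic", "Red", 2017),    # hypothetical extra row
        ("Porshe", "911", "Black", 2015),   # hypothetical extra row
    ],
)

has_blue = """EXISTS (SELECT 1 FROM inventoryTable t
                      WHERE t.colBrand = i.colBrand
                        AND t.colColor LIKE '%blue%')"""

# Ordering as posted: non-blue brands are sorted by year before brand.
as_posted = conn.execute(f"""
    SELECT colBrand, colYear FROM inventoryTable i
    ORDER BY CASE WHEN {has_blue} THEN 9999 ELSE colYear END DESC,
             colBrand ASC, colYear DESC
""").fetchall()

# One possible repair: use a plain 0/1 flag so brand is always the second key.
fixed = conn.execute(f"""
    SELECT colBrand, colYear FROM inventoryTable i
    ORDER BY CASE WHEN {has_blue} THEN 0 ELSE 1 END,
             colBrand ASC, colYear DESC
""").fetchall()

print([brand for brand, _ in as_posted])
print([brand for brand, _ in fixed])
```

With the extra rows, the posted ordering puts Honda between the two Porshe rows, while the 0/1 flag keeps every brand contiguous.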

Related

SSRS query and WHERE with multiple

I'm new to SQL and SSRS. I can already do many things, but I think I must be missing some basics, because I keep banging my head against the wall.
I have a report that is almost working, but it needs to include more results, based on conditions.
My working query so far looks like this:
SELECT projects.project_number,
       project_phases.project_phase_id,
       project_phases.project_phase_number,
       project_phases.project_phase_header,
       project_phase_expensegroups.projectphase_expense_total,
       invoicerows.invoicerow_total
FROM projects
INNER JOIN project_phases
    ON projects.project_id = project_phases.project_id
LEFT OUTER JOIN project_phase_expensegroups
    ON project_phases.project_phase_id = project_phase_expensegroups.project_phase_id
LEFT OUTER JOIN invoicerows
    ON project_phases.project_phase_id = invoicerows.project_phase_id
WHERE ( projects.project_number = #iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total > 0 )
The parameter comes from a selection list that is used to choose the project for the report.
How can I also include records where
( project_phase_expensegroups.projectphase_expense_total ) is 0 but there may be invoices for that project phase?
I already tried adding another condition, like this:
WHERE ( projects.project_number = #iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total > 0 )
OR
( invoicerows.invoicerow_total > 0 )
This does return some results, including the rows where projectphase_expense_total is 0, but the report is a total mess.
So my question is: what am I doing wrong here?
There is a core problem with your query: you are left joining to two tables, implying that rows may not exist, but then putting conditions on those tables, which will eliminate the NULLs. That means your query is internally inconsistent as it stands.
The next problem is that you're joining two tables to project_phases that may both have multiple rows. Since these data are not related to each other (as shown by the fact that you have no join condition between project_phase_expensegroups and invoicerows), your query is not going to work correctly. For example, given a list of people, a list of those people's favorite foods, and a list of their favorite colors like so:
People
Person
------
Joe
Mary
FavoriteFoods
Person  Food
------  ---------
Joe     Broccoli
Joe     Bananas
Mary    Chocolate
Mary    Cake
FavoriteColors
Person  Color
------  ----------
Joe     Red
Joe     Blue
Mary    Periwinkle
Mary    Fuchsia
When you join these with links between Person <-> Food and Person <-> Color, you'll get a result like this:
Person  Food       Color
------  ---------  ----------
Joe     Broccoli   Red
Joe     Bananas    Red
Joe     Broccoli   Blue
Joe     Bananas    Blue
Mary    Chocolate  Periwinkle
Mary    Chocolate  Fuchsia
Mary    Cake       Periwinkle
Mary    Cake       Fuchsia
This is essentially a cross-join, also known as a Cartesian product, between the Foods and the Colors, because they have a many-to-one relationship with each person, but no relationship with each other.
There are a few ways to deal with this in the report.
Create ExpenseGroup and InvoiceRow subreports, that are called from the main report by a combination of project_id and project_phase_id parameters.
Summarize one or the other set of data into a single value. For example, you could sum the invoice rows. Or, you could concatenate the expense groups into a single string separated by commas.
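To make the fan-out and the "summarize first" option concrete, here is a small sketch in Python with sqlite3. The tables phases, expense_groups and invoice_rows and their contents are simplified, hypothetical stand-ins for the tables in the question, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phases (phase_id INTEGER PRIMARY KEY);
CREATE TABLE expense_groups (phase_id INTEGER, total INTEGER);
CREATE TABLE invoice_rows (phase_id INTEGER, total INTEGER);
INSERT INTO phases VALUES (1), (2);
-- phase 1 has two expense groups and three invoice rows
INSERT INTO expense_groups VALUES (1, 100), (1, 50);
INSERT INTO invoice_rows VALUES (1, 30), (1, 20), (1, 10), (2, 40);
""")

# Joining both child tables directly multiplies the rows (2 x 3 for phase 1),
# so any SUM over the joined rows is inflated.
naive = conn.execute("""
    SELECT p.phase_id, COUNT(*) AS joined_rows, SUM(i.total) AS invoice_sum
    FROM phases p
    LEFT JOIN expense_groups e ON e.phase_id = p.phase_id
    LEFT JOIN invoice_rows i ON i.phase_id = p.phase_id
    GROUP BY p.phase_id
    ORDER BY p.phase_id
""").fetchall()

# Summarizing each child table to one row per phase before joining
# keeps one row per phase and gives the true totals.
summarized = conn.execute("""
    SELECT p.phase_id, e.expense_total, i.invoice_total
    FROM phases p
    LEFT JOIN (SELECT phase_id, SUM(total) AS expense_total
               FROM expense_groups GROUP BY phase_id) e
           ON e.phase_id = p.phase_id
    LEFT JOIN (SELECT phase_id, SUM(total) AS invoice_total
               FROM invoice_rows GROUP BY phase_id) i
           ON i.phase_id = p.phase_id
    ORDER BY p.phase_id
""").fetchall()

print(naive)
print(summarized)
```

In the naive version phase 1 yields six joined rows and an invoice sum of 120 instead of 60; the pre-aggregated version reports 150 in expenses and 60 in invoices for phase 1, and NULL expenses with 40 in invoices for phase 2.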
Some notes:
Please, please format your query before posting it in a question. It is almost impossible to read when not formatted. It seems pretty clear that you're using a GUI to create the query, but do us the favor of not having to format it ourselves just to help you.
While formatting, please use aliases. Don't use full table names; they just make the query that much harder to understand.
You need an extra pair of parentheses in your WHERE clause to get the logic right, because AND binds more tightly than OR:
WHERE ( projects.project_number = #iProjectNumber )
AND (
(project_phase_expensegroups.projectphase_expense_total > 0)
OR
(invoicerows.invoicerow_total > 0)
)
Also, you're using a column in your WHERE clause from a table that is left joined without checking for NULLs. That basically makes it a (slow) inner join. If you want to include rows that don't match from that table, you also need to check for NULL. For a NULL value, any comparison other than IS NULL or IS NOT NULL evaluates to UNKNOWN rather than true, so the row is filtered out. See this page for more information about SQL's three-valued predicate logic: http://www.firstsql.com/idefend3.htm
To keep your LEFT JOINs working as you intended you would need to do this:
WHERE ( projects.project_number = #iProjectNumber )
AND (
project_phase_expensegroups.projectphase_expense_total > 0
OR project_phase_expensegroups.project_phase_id IS NULL
OR invoicerows.invoicerow_total > 0
OR invoicerows.project_phase_id IS NULL
)
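A quick way to see the effect is the following sketch in Python with sqlite3. The tables phases and expenses are cut-down, hypothetical versions of the ones in the question: phase 1 has a positive expense total, phase 2 has a total of 0, and phase 3 has no expense row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phases (phase_id INTEGER PRIMARY KEY);
CREATE TABLE expenses (phase_id INTEGER, total INTEGER);
INSERT INTO phases VALUES (1), (2), (3);
INSERT INTO expenses VALUES (1, 100), (2, 0);
""")

# Filtering on the left-joined column alone: the unmatched phase 3 has
# e.total = NULL, the comparison is UNKNOWN, and the row disappears.
filtered = conn.execute("""
    SELECT p.phase_id, e.total
    FROM phases p
    LEFT JOIN expenses e ON e.phase_id = p.phase_id
    WHERE e.total > 0
    ORDER BY p.phase_id
""").fetchall()

# Adding the IS NULL check keeps the phases that have no expense row.
with_null_check = conn.execute("""
    SELECT p.phase_id, e.total
    FROM phases p
    LEFT JOIN expenses e ON e.phase_id = p.phase_id
    WHERE e.total > 0 OR e.phase_id IS NULL
    ORDER BY p.phase_id
""").fetchall()

print(filtered)
print(with_null_check)
```

The first query behaves like an inner join and returns only phase 1; the second also returns phase 3 with a NULL total. Phase 2 is dropped in both because its total really is 0.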
I found the solution, and it was kind of easy after all. I changed only the second LEFT OUTER JOIN to an INNER JOIN and dropped the condition that restricted the query to results over zero. I also used SELECT DISTINCT.
Now my report is working perfectly.

How to tell in a Query I don't want duplicates?

So I have this query, and it's pulling from tables like these:
Plantation Table
PLANT ID  Color Description
1         Red
2         Green
3         Purple
Vegetable Table
VegetableID  PLANT ID  Feeldesc
199          1         Harsh
200          1         Sticky
201          2         Bitter
202          3         Bland
In my query I join them on PLANT ID (I use a left join):
PLANT ID  Color Description  Feeldesc
1         Red                Harsh
1         Red                Sticky
2         Green              Bitter
3         Purple             Bland
So the problem is that Red shows up twice in the query result! I can't have this, and
I'm not sure how to do the join so that Red stops coming up twice.
It seems remotely possible that you're asking how to do group indication, that is, showing a value which identifies or describes a group only on the first line of that group. In that case, you want to use the lag() window function.
Assuming setup of the schema and data is like this:
create table plant (plantId int not null primary key, color text not null);
create table vegetable (vegetableId int not null, plantId int not null,
Feeldesc text not null, primary key (vegetableId, plantId));
insert into plant values (1,'Red'),(2,'Green'),(3,'Purple');
insert into vegetable values (199,1,'Harsh'),(200,1,'Sticky'),
(201,2,'Bitter'),(202,3,'Bland');
The results you show (modulo column headings) could be obtained with this simple query:
select p.plantId, p.color, v.Feeldesc
from plant p left join vegetable v using (plantId)
order by plantId, vegetableId;
If you're looking to suppress display of the repeated information after the first line, this query will do it:
select
    case when plantId = lag(plantId) over w then null
         else plantId end as plantId,
    case when p.color = lag(p.color) over w then null
         else p.color end as color,
    v.Feeldesc
from plant p left join vegetable v using (plantId)
window w as (partition by plantId order by vegetableId)
-- explicit final ordering, so each group's first line reliably comes first
order by p.plantId, v.vegetableId;
The results look like this:
plantid | color | feeldesc
---------+--------+----------
1 | Red | Harsh
| | Sticky
2 | Green | Bitter
3 | Purple | Bland
(4 rows)
I had to do something like the above just this week to produce a listing directly out of psql which was easy for the end user to read; otherwise it never would have occurred to me that you might be asking about this functionality. Hopefully this answers your question, although I might be completely off base.
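The same group-indication idea can be tried outside PostgreSQL. Below is a sketch in Python with sqlite3, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The output aliases shown_id and shown_color are names picked for this example, so they cannot be confused with the underlying columns in the ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plant (plantId INTEGER PRIMARY KEY, color TEXT NOT NULL);
CREATE TABLE vegetable (vegetableId INTEGER NOT NULL, plantId INTEGER NOT NULL,
                        Feeldesc TEXT NOT NULL,
                        PRIMARY KEY (vegetableId, plantId));
INSERT INTO plant VALUES (1,'Red'),(2,'Green'),(3,'Purple');
INSERT INTO vegetable VALUES (199,1,'Harsh'),(200,1,'Sticky'),
                             (201,2,'Bitter'),(202,3,'Bland');
""")

# LAG() looks at the previous row inside the same plant; when the value
# repeats, the CASE blanks it out so only the first line of a group shows it.
query = """
SELECT
    CASE WHEN p.plantId = LAG(p.plantId) OVER w THEN NULL
         ELSE p.plantId END AS shown_id,
    CASE WHEN p.color = LAG(p.color) OVER w THEN NULL
         ELSE p.color END AS shown_color,
    v.Feeldesc
FROM plant AS p
LEFT JOIN vegetable AS v ON v.plantId = p.plantId
WINDOW w AS (PARTITION BY p.plantId ORDER BY v.vegetableId)
ORDER BY p.plantId, v.vegetableId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The second Red row comes back with NULL (None in Python) for both the id and the color, matching the psql listing above.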
Check the array_agg function in the documentation. It can be used something like this:
SELECT
p.plantId
,p.color
,array_to_string(array_agg(v.Feeldesc),', ') AS feeldescs
FROM
vegetable v
INNER JOIN plant p USING (plantId)
GROUP BY
p.plantId
,p.color
or use
SELECT DISTINCT
p.plantId
,p.color
FROM
vegetable v
INNER JOIN plant p USING (plantId)
disclaimer: hand written, syntax errors expected :)
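For a runnable version of the "one row per plant" idea, here is a sketch in Python with sqlite3. Note the swap: SQLite has no array_agg, so this uses its group_concat aggregate as a stand-in for array_to_string(array_agg(...)). The order of the concatenated values is not guaranteed, and the table layout follows the setup script from the other answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plant (plantId INTEGER PRIMARY KEY, color TEXT NOT NULL);
CREATE TABLE vegetable (vegetableId INTEGER NOT NULL, plantId INTEGER NOT NULL,
                        Feeldesc TEXT NOT NULL);
INSERT INTO plant VALUES (1,'Red'),(2,'Green'),(3,'Purple');
INSERT INTO vegetable VALUES (199,1,'Harsh'),(200,1,'Sticky'),
                             (201,2,'Bitter'),(202,3,'Bland');
""")

# Collapse all Feeldesc values of a plant into one comma-separated string,
# so each plant (and therefore each color) appears exactly once.
rows = conn.execute("""
    SELECT p.plantId, p.color, group_concat(v.Feeldesc, ', ') AS feeldescs
    FROM plant AS p
    JOIN vegetable AS v ON v.plantId = p.plantId
    GROUP BY p.plantId, p.color
    ORDER BY p.plantId
""").fetchall()
for row in rows:
    print(row)
```

Red now appears on a single row whose third column holds both 'Harsh' and 'Sticky'.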

SELECT datafields with multiple groups and sums

I can't seem to group by multiple data fields and sum a particular grouped column.
I want to group Person to customer and then group customer to price and then sum price. The person with the highest combined sum(price) should be listed in ascending order.
Example:
table customer
-----------
customer | common_id
green      2
blue       2
orange     1
table invoice
----------
person | price | common_id
bob      2330    1
greg     360     2
greg     170     2
SELECT DISTINCT
min(person) As person,min(customer) AS customer, sum(price) as price
FROM invoice a LEFT JOIN customer b ON a.common_id = b.common_id
GROUP BY customer,price
ORDER BY person
The results I desire are:
**BOB:**
Orange, $2230
**GREG:**
green, $360
blue,$170
The colors are the customer, that GREG and Bob handle. Each color has a price.
There are two issues that I can see. One is a bit picky, and one is quite fundamental.
Presentation of data in SQL
SQL returns tabular data sets. It's not able to return sub-sets with headings, looking something like a pivot table.
This means that this is not possible...
**BOB:**
Orange, $2230
**GREG:**
green, $360
blue, $170
But that this is possible...
Bob, Orange, $2230
Greg, Green, $360
Greg, Blue, $170
Relating data
I can visually see how you relate the data together...
table customer           table invoice
--------------           -------------
customer | common_id     person | price | common_id
green      2             greg     360     2
blue       2             greg     170     2
orange     1             bob      2330    1
But SQL doesn't have any implied ordering. Things can only be related if an expression can state that they are related. For example, the following is equally possible...
table customer           table invoice
--------------           -------------
customer | common_id     person | price | common_id
green      2             greg     170     2    \ These two have
blue       2             greg     360     2    / been swapped
orange     1             bob      2330    1
This means that you need rules (and likely additional fields) that explicitly state which customer record matches which invoice record, especially when there are multiples in both with the same common_id.
An example of a rule could be, the lowest price always matches with the first customer alphabetically. But then, what happens if you have three records in customer for common_id = 2, but only two records in invoice for common_id = 2? Or do the number of records always match, and do you enforce that?
Most likely you need an extra piece (or pieces) of information to know which records relate to each other.
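The ambiguity is easy to see by running the join on the sample data. Here is a sketch in Python with sqlite3, using the two tables exactly as posted in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer TEXT, common_id INTEGER);
CREATE TABLE invoice (person TEXT, price INTEGER, common_id INTEGER);
INSERT INTO customer VALUES ('green', 2), ('blue', 2), ('orange', 1);
INSERT INTO invoice VALUES ('bob', 2330, 1), ('greg', 360, 2), ('greg', 170, 2);
""")

# common_id is the only link between the tables, so every greg invoice
# pairs with every customer that shares common_id = 2.
rows = conn.execute("""
    SELECT a.person, b.customer, a.price
    FROM invoice AS a
    LEFT JOIN customer AS b ON a.common_id = b.common_id
    ORDER BY a.person, a.price, b.customer
""").fetchall()
for row in rows:
    print(row)
```

Bob gets a single row, but greg gets four: both prices are paired with both colors, and nothing in the data says which pairing is the intended one.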
You should group by all of your selected fields except the sum. The group_concat function (MySQL) may then help you concatenate the resulting rows of each group.
I'm not sure how you could possibly do this. Greg has 2 colors AND 2 prices, so how do you determine which goes with which?
Greg Blue 170 or Greg Blue 360? Or should Green be attached to either price?
I think the colors need to have unique identifiers, separate from the persons' unique identifiers.
Just a thought.

Beginner SQL question: querying gold and silver tag badges in Stack Exchange Data Explorer

I'm using the Stack Exchange Data Explorer to learn SQL, but I think the fundamentals of the question are applicable to other databases.
I'm trying to query the Badges table, which according to Stexdex (that's what I'm going to call it from now on) has the following schema:
Badges
Id
UserId
Name
Date
This works well for badges like [Epic] and [Legendary], which have unique names, but the silver and gold tag-specific badges seem to be mixed together because they share the exact same name.
Here's an example query I wrote for [mysql] tag:
SELECT
UserId as [User Link],
Date
FROM
Badges
Where
Name = 'mysql'
Order By
Date ASC
The (slightly annotated) output, as seen on stexdex, is:
User Link        Date
---------------  -------------------  // all for silver except where noted
Bill Karwin      2009-02-20 11:00:25
Quassnoi         2009-06-01 10:00:16
Greg             2009-10-22 10:00:25
Quassnoi         2009-10-31 10:00:24  // for gold
Bill Karwin      2009-11-23 11:00:30  // for gold
cletus           2010-01-01 11:00:23
OMG Ponies       2010-01-03 11:00:48
Pascal MARTIN    2010-02-17 11:00:29
Mark Byers       2010-04-07 10:00:35
Daniel Vassallo  2010-05-14 10:00:38
This is consistent with the current list of silver and gold earners at the moment of this writing, but to speak in more timeless terms, as of the end of May 2010 only 2 users have earned the gold [mysql] tag: Quassnoi and Bill Karwin, as evidenced in the above result by their names being the only ones that appear twice.
So this is the way I understand it:
The first time an Id appears (in chronological order) is for the silver badge
The second time is for the gold
Now, the above result mixes the silver and gold entries together. My questions are:
Is this a typical design, or are there much friendlier schema/normalization/whatever you call it?
In the current design, how would you query the silver and gold badges separately?
GROUP BY Id and picking the min/max or first/second by the Date somehow?
How can you write a query that lists all the silver badges first then all the gold badges next?
Imagine also that the "real" query may be more complicated, i.e. not just listing by date.
How would you write it so that it doesn't have too many repetition between the silver and gold subqueries?
Is it perhaps more typical to do two totally separate queries instead?
What is this idiom called? A row "partitioning" query to put them into "buckets" or something?
Requirement clarification
Originally I wanted the following output, essentially:
User Link        Date
---------------  -------------------
Bill Karwin      2009-02-20 11:00:25  // result of query for silver
Quassnoi         2009-06-01 10:00:16  // :
Greg             2009-10-22 10:00:25  // :
cletus           2010-01-01 11:00:23  // :
OMG Ponies       2010-01-03 11:00:48  // :
Pascal MARTIN    2010-02-17 11:00:29  // :
Mark Byers       2010-04-07 10:00:35  // :
Daniel Vassallo  2010-05-14 10:00:38  // :
------- maybe some sort of row separator here? can SQL do this? -------
Quassnoi         2009-10-31 10:00:24  // result of query for gold
Bill Karwin      2009-11-23 11:00:30  // :
But the answers so far with a separate column for silver and gold is also great, so feel free to pursue that angle as well. I'm still curious how you'd do the above, though.
Is this a typical design, or are there much friendlier schema/normalization/whatever you call it?
Sure, you could add a type code to make it more explicit. But when you consider that one can not get a gold badge before a silver one, the date stamp makes a lot of sense to differentiate between them.
In the current design, how would you query the silver and gold badges separately? GROUP BY Id and picking the min/max or first/second by the Date somehow?
Yes - joining onto a derived table (AKA inline view) that is a list of users & the minimum date would return the silver badges. Using HAVING COUNT(*) >= 1 would work too. You'd have to use a combination of GROUP BY and HAVING COUNT(*) = 2 to get gold badges, since the max date alone doesn't ensure that there is more than one record for a userid...
How can you write a query that lists all the silver badges first then all the gold badges next?
Sorry - by users, or all silvers first and then golds? The former might be done simply by using ORDER BY t.userid, t.date; the latter I'd likely use analytic functions (IE: ROW_NUMBER(), RANK())...
Is it perhaps more typical to do two totally separate queries instead?
See above about how vague your requirements are, to me anyways...
What is this idiom called? A row "partitioning" query to put them into "buckets" or something?
What you're asking about is referred to by the following synonyms: Analytic, Windowing, ranking...
You'd rely only on the date or the count in an aggregate.
Arguably, it also makes little sense to query silver followed by gold; it is more natural to get the data side by side.
Unfortunately, you haven't really specified what you want, but a good starting point for aggregates is to express the request in plain English.
Example: "Give me the dates of silver and gold badge awards per user for tag mysql", which this does:
SELECT
UserId as [User Link],
min(Date) as [Silver Date],
case when count(*) = 1 THEN NULL ELSE max(date) END as [Gold Date]
FROM
Badges
Where
Name = 'mysql'
group by
UserId
Order By
case when count(*) = 1 THEN NULL ELSE max(date) END DESC, min(Date)
Edit, after update:
Your desired output is not really SQL: it's 2 separate recordsets. The separator is a no-go. As SQL is set-based, there is no "natural" order, so this introduces one:
SELECT
UserId as [User Link],
min(Date) as [Date],
0 as dummyorder
FROM
Badges
Where
Name = 'mysql'
group by
UserId
union all
select
UserId as [User Link],
max(Date) as [Date],
1 as dummyorder
FROM
Badges
Where
Name = 'mysql'
group by
UserId
having
count(*) = 2
Order By
dummyorder, Date
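To check the UNION ALL approach against the sample output from the question, here is a sketch in Python with sqlite3. It is only an approximation of the Data Explorer setup: the user names stand in for the numeric UserId, the Id column is left out, and the aliases user_link and award_date replace the bracketed T-SQL names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Badges (UserId TEXT, Name TEXT, Date TEXT)")
# The ten [mysql] badge rows shown in the question.
awards = [
    ("Bill Karwin", "2009-02-20 11:00:25"),
    ("Quassnoi", "2009-06-01 10:00:16"),
    ("Greg", "2009-10-22 10:00:25"),
    ("Quassnoi", "2009-10-31 10:00:24"),
    ("Bill Karwin", "2009-11-23 11:00:30"),
    ("cletus", "2010-01-01 11:00:23"),
    ("OMG Ponies", "2010-01-03 11:00:48"),
    ("Pascal MARTIN", "2010-02-17 11:00:29"),
    ("Mark Byers", "2010-04-07 10:00:35"),
    ("Daniel Vassallo", "2010-05-14 10:00:38"),
]
conn.executemany(
    "INSERT INTO Badges VALUES (?, 'mysql', ?)", awards
)

# First branch: earliest award per user (silver), tagged 0.
# Second branch: latest award for users with two awards (gold), tagged 1.
query = """
SELECT UserId AS user_link, MIN(Date) AS award_date, 0 AS dummyorder
FROM Badges WHERE Name = 'mysql'
GROUP BY UserId
UNION ALL
SELECT UserId, MAX(Date), 1
FROM Badges WHERE Name = 'mysql'
GROUP BY UserId
HAVING COUNT(*) = 2
ORDER BY dummyorder, award_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The eight silver awards come out first in date order, followed by the two gold awards (Quassnoi, then Bill Karwin); the dummyorder column can be ignored or hidden by the caller.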

SQL: finding double entries without losing the ID

I have 1 table "Products" that looks like this:
ID  Product  Shop  Color
01  Car      A     Black
02  Car      B     Black
03  Bike     C     Red
04  Plane    A     Silver
05  Car      C     Black
06  Bike     A     Red
In this example, a Product always has the same color, independent from the Shop where it is sold.
I want to make a query, that returns a distinct set of products, with the Color property. I also will need to have an ID, it could be any ID, that allows me to do a follow up query.
The result of the query should be:
ID  Product  Color
01  Car      Black
03  Bike     Red
04  Plane    Silver
I tried:
SELECT DISTINCT
Product, Color
FROM
Products
But that obviously doesn't return the ID as well
I guess I need to join something, but my knowledge of SQL is too poor. I hope this is something simple.
This would be one way of getting the result you want:
SELECT min(ID) AS ID, Product, Color FROM Products GROUP BY Product, Color;
How about:
SELECT
Product, Color, Min(ID) AS ID
FROM
Products
GROUP BY
Product, Color
That'll return unique Product/Color combinations and the first (lowest) ID found.
You need to use the GROUP BY clause.
The same, but obtaining the maximum ID:
SELECT MAX(ID) AS ID, Product, Color
FROM Products
GROUP BY Product, Color
ORDER BY ID
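A quick check of the MIN(ID) variant against the sample table, as a sketch in Python with sqlite3. Storing ID as text to keep the leading zeros is an assumption, and the alias first_id is just a name picked for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ID TEXT, Product TEXT, Shop TEXT, Color TEXT)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        ("01", "Car", "A", "Black"),
        ("02", "Car", "B", "Black"),
        ("03", "Bike", "C", "Red"),
        ("04", "Plane", "A", "Silver"),
        ("05", "Car", "C", "Black"),
        ("06", "Bike", "A", "Red"),
    ],
)

# One row per Product/Color pair, carrying the lowest ID of that pair.
rows = conn.execute("""
    SELECT MIN(ID) AS first_id, Product, Color
    FROM Products
    GROUP BY Product, Color
    ORDER BY first_id
""").fetchall()
for row in rows:
    print(row)
```

This returns the three rows asked for in the question: 01 Car Black, 03 Bike Red and 04 Plane Silver.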