Is there a name for the SQL query pattern that gets the row data for each record in a GROUPed query? - sql

A question about how to get the data for each record that is the max in some GROUP appears over and over again on the net. There are many solutions, some of them easier to conceptualize than others. Does the query 'template' here have a name? I ask because I believe one of the other patterns is called a 'correlated subquery'. I need to issue this type of query often, and if I had names for the approaches I would have a better mental index of the possible solutions to try.
Here's another example of the query type I want to know a name for.

I would call that an example of a query with a compound join.

Related

I'm being asked to create IN queries for different GUIDs...huh?

I'm a GIS intern.
I've been asked:
"Could you also create IN queries for the different sets of GUIDs? Here is an example:
"GlobalID" IN ('{58BEE03F-1656-4BD5-B53D-B887E93A5287}', '{009C7364-8D77-46B3-A531-B60ED4E5B407}', '{0105263C-1305-4AB9-A00A-4BED01832177}')"
I'm not sure what that means or why I'd have to do it. What I can tell you is that I have several .shp that I have geocoded and then created global IDs for.
I've googled this for hours now and am no closer to understanding the request than I was. It could be that the answer is staring me in the face but I don't think I know enough to know that.
Thank you,
Kathy
In order to create and understand IN queries, first you'll have to understand the basics of a query. It sounds like this might not be something you're familiar with, so I'll start with that.
There are 3 main parts to a query: SELECT, FROM, and WHERE.
SELECT lists the information (the columns) you want to return. You can SELECT * to select all columns or SELECT specificColumn1, specificColumn2 to select specific columns.
The next part is the FROM clause. FROM determines which table(s) you will be querying. You can query multiple tables here if you like, and tables can also be aliased like so: FROM table1 t1.
The third part is the WHERE clause, which specifies any conditions that the returned rows are required to meet. In your case, this is where your IN condition will go. There are a ton of different keywords you can use here, but I'll just give a quick sample query for you (keep in mind I have no idea what your schema looks like).
SELECT *
FROM GUIDData
WHERE GlobalID IN ('{58BEE03F-1656-4BD5-B53D-B887E93A5287}', '{009C7364-8D77-46B3-A531-B60ED4E5B407}', '{0105263C-1305-4AB9-A00A-4BED01832177}');
This query returns all the columns for each row in the GUIDData table whose global ID is {58BEE03F-1656-4BD5-B53D-B887E93A5287}, {009C7364-8D77-46B3-A531-B60ED4E5B407}, or {0105263C-1305-4AB9-A00A-4BED01832177}.
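If you want to try the IN condition without touching real data, here is a minimal sketch using Python's built-in sqlite3 module. The GUIDData table and GlobalID column come from the sample query above; the Name column, the sample rows, and the all-zero GUID are made up for illustration. Your GIS software will have its own place to enter the WHERE part, so treat this only as a way to see what IN does.

```python
import sqlite3

# In-memory database with a hypothetical GUIDData table (Name is an assumed column).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GUIDData (GlobalID TEXT PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO GUIDData VALUES (?, ?)",
    [
        ("{58BEE03F-1656-4BD5-B53D-B887E93A5287}", "site A"),
        ("{009C7364-8D77-46B3-A531-B60ED4E5B407}", "site B"),
        # Placeholder GUID that is NOT in the wanted list, so it must be filtered out.
        ("{00000000-0000-0000-0000-000000000000}", "site C"),
    ],
)

# The set of GUIDs we are interested in (taken from the question).
wanted = [
    "{58BEE03F-1656-4BD5-B53D-B887E93A5287}",
    "{009C7364-8D77-46B3-A531-B60ED4E5B407}",
    "{0105263C-1305-4AB9-A00A-4BED01832177}",
]

# Build one "?" placeholder per GUID so the values are passed as parameters.
placeholders = ", ".join("?" for _ in wanted)
sql = (
    "SELECT GlobalID, Name FROM GUIDData "
    f"WHERE GlobalID IN ({placeholders}) ORDER BY Name"
)
matches = conn.execute(sql, wanted).fetchall()

for global_id, name in matches:
    print(global_id, name)
```

Only rows whose GlobalID appears in the list come back; a GUID in the list that has no row in the table is simply ignored.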
Did this help?

Pervasive SQL query for entire database

The question was already asked here,
Pervasive SQL query
but never answered.
Can somebody help to create a query that will search the entire database for a specific value?
Sorry, I can't comment on the previous question as I am a new user and don't have enough reputation to do so.
There is not a built-in way to search every single column for a specific value.
I'm not exactly sure why you want to search every single column for a specific value. Seems a little excessive in terms of the performance hit on the database.
If you really need to do this, the best suggestion I can give would be to write a stored procedure that iterates over all of the tables, then iterates over all of the fields in each table and uses them in the WHERE clause. A better way to do it would be to build the query using only the fields where the value is likely to be. For example, if you're trying to search all the tables for a specific ID, you probably don't need to search date, currency, or quantity fields. How you do this will also depend on the version of PSQL you're using.
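To make the "iterate tables, then iterate fields" idea concrete, here is a rough sketch. It is written against SQLite through Python's sqlite3 module, not against Pervasive PSQL, because SQLite's catalog (sqlite_master and PRAGMA table_info) is what I can run here; on PSQL you would read the table and column names from its own system tables instead. The function name and the sample tables are invented for the example.

```python
import sqlite3


def search_all_columns(conn, needle):
    """Return (table, column, match_count) for every column containing needle."""
    hits = []
    # Step 1: list the user tables from SQLite's catalog.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    for table in tables:
        # Step 2: list the columns of this table (index 1 is the column name).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            # Step 3: one query per column; the value itself is a bound parameter.
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" = ?', (needle,)
            ).fetchone()[0]
            if count:
                hits.append((table, column, count))
    return hits


# Hypothetical sample data to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "ACME", "Oslo"), (2, "Globex", "Paris")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ACME", "rush"), (2, "ACME", "none"), (3, "Globex", "ACME")],
)

hits = search_all_columns(conn, "ACME")
print(hits)
```

Note that this runs one full scan per column, which is exactly the performance cost mentioned above; restricting the column list to plausible fields cuts that down.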
If you explain what you hope to accomplish and why you need it, we might be able to offer better suggestions.

How to simulate ifs in a sql query that is not database server dependent?

Given the below table:
|idAsPrimaryKey|Id - it has a diff name, but it is easier like this|column A|
How can I select in a single sql query, not database server specific, something similar to:
ListOfResults = null
for each different id:
    if there is a row for this id that has for column A the value V1
        ListOfResults add this found row
    else
        if there is a row for this id that has for column A the value V2
            ListOfResults add this found row
        else
            add to ListOfResults the first row found for this id
Quite easy. Since you don't seem to know much about SQL, here's a "teach a man how to fish..." answer.
You have a set of data and "only" a language for retrieving it, nothing to really "program" with. (Of course there are functions and procedures and so on, but those are used in other circumstances, or the programmer is making things more complicated than necessary.)
Because of this, you have to find a way to combine the data, sometimes even with itself, to get what you want. This blog post explains the basics of joins (that's how you combine tables or data from subqueries): A Visual Explanation of SQL Joins (for critics of this post, please read on...)
With this basic knowledge you should now try to create a query where you join your table to itself two times. To choose the right value for your ListOfResults you then have to use the COALESCE() function. It returns the first of its parameters which isn't NULL.
Here comes the criticism of the link I posted above. The Venn diagrams used in the first link don't represent how much data you get back from joining. To learn about that, read this answer here on SO: sql joins as venn diagram
Okay, now you have learned that you might get more data back than you expect. And here comes another problem, in the wording of your question. There's no "first" row in relational databases; you have to describe exactly which row you want, else the data you get back is actually worth nothing. You get random data. A solution for both problems is using GROUP BY and (important!) an appropriate aggregate function.
This should be enough info for you to solve the problem. Feel free to ask more questions if anything is unclear.
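For readers who want to check their attempt, here is one possible shape of such a query, run through Python's sqlite3 module so it can be tried directly. It is a sketch under assumptions: the table is called t, the columns are pk, id and a (names invented here), and "first row" is taken to mean "smallest primary key", since the question does not define it. The SQL itself uses only LEFT JOIN, COALESCE, MIN and GROUP BY, which are widely supported.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: pk is the primary key, id is the grouping id, a is "column A".
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, id INTEGER, a TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 10, "X"), (2, 10, "V2"), (3, 10, "V1"),  # id 10 has a V1 row
        (4, 20, "X"), (5, 20, "V2"),                 # id 20 has only a V2 row
        (6, 30, "Z"), (7, 30, "Y"),                  # id 30 has neither
    ],
)

QUERY = """
SELECT t.pk, t.id, t.a
FROM t
JOIN (
    -- Join the table to itself twice: once for V1 rows, once for V2 rows.
    -- COALESCE picks the V1 row if any, else the V2 row, else the fallback.
    -- MIN() makes the choice deterministic when several rows qualify.
    SELECT base.id,
           COALESCE(MIN(v1.pk), MIN(v2.pk), MIN(base.pk)) AS pick
    FROM t AS base
    LEFT JOIN t AS v1 ON v1.id = base.id AND v1.a = 'V1'
    LEFT JOIN t AS v2 ON v2.id = base.id AND v2.a = 'V2'
    GROUP BY base.id
) AS chosen ON chosen.pick = t.pk
ORDER BY t.id
"""

picked = conn.execute(QUERY).fetchall()
for row in picked:
    print(row)
```

The self joins multiply rows within each id, which is why the GROUP BY with an aggregate is needed to collapse each id back to a single, well-defined choice.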

question about aggregate function internals in SQL/Postgres

How does a function like SUM work? If I execute
select id,sum(a) from mytable group by id
does it sort by id and then sum over each range of equal id's? I am no planner expert, but it looks like that is what is happening, where mytable is maybe a hundred million rows with a few million distinct id's.
Or does it just keep a hash of id -> current_sum, and then at each row either increment the value stored for that id or add a new key? Isn't that far faster and less memory hungry?
SQL standards try to dictate external behavior, not internal behavior. In this particular case, a SQL implementation that conforms to (one of the many) standards is supposed to act as if it does things in this order:
1. Build a working table from all the table constructors in the FROM clause. (There's only one in your example.)
2. In the GROUP BY clause, partition the working table into groups. Reduce each group to one row. Replace the working table with the grouped table.
3. Resolve the expressions in the SELECT clause.
Query optimizers that follow SQL standards are free to rearrange things however they like, as long as the result is the same as if they had followed those steps.
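To illustrate why the optimizer has that freedom, here is a small Python sketch of the two strategies the question describes: sort by id and sum each run, versus keep a hash of id -> running sum. This is not PostgreSQL's code, only a toy model with invented function names and sample rows. Both give the same answer, so the planner can choose based on cost; in PostgreSQL's EXPLAIN output the two approaches show up as GroupAggregate and HashAggregate nodes.

```python
from itertools import groupby
from operator import itemgetter


def sum_by_sort(rows):
    """Sort rows by id, then sum each run of equal ids."""
    ordered = sorted(rows, key=itemgetter(0))
    return {
        key: sum(a for _, a in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


def sum_by_hash(rows):
    """Single pass: keep a dict of id -> running sum."""
    totals = {}
    for key, a in rows:
        totals[key] = totals.get(key, 0) + a
    return totals


# Hypothetical (id, a) rows standing in for mytable.
rows = [(2, 5), (1, 1), (2, 7), (3, 4), (1, 9)]

sorted_result = sum_by_sort(rows)
hashed_result = sum_by_hash(rows)
print(sorted_result)
print(hashed_result)
```

The hash version needs memory proportional to the number of distinct ids, while the sort version needs the input in order (or an index that already provides it), which is roughly the trade-off a planner weighs.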
You can find more details in the answers and comments to this SO question.
So, I found this, http://helmingstay.blogspot.com/2009/06/postgresql-poetry-aggregate-median-with.html, which claims that it does indeed use the accumulator pattern. Hmmm.

How would you give the user a preference for how data from an SQL table is to be printed?

I've been given a task by a prospective employer which involves SQL tables. One requirement they mentioned is that they want the name retrieved from a table called "Employees" to come in the format of either "<LastName>, <FirstName>" OR "<FirstName> <MiddleName> <LastName> <Suffix>".
This is confusing to me because it sounds like they're asking me to write a function or something. I could probably do this in a programming language and have the information retrieved that way, but doing this exclusively in SQL seems odd to me. I'm rather new to SQL, and my familiarity with it doesn't exceed simple tasks such as creating databases, tables, and fields, inserting data into fields, updating fields in records, deleting records which meet a specific condition, and selecting fields from tables.
I hope that this isn't considered cheating since I mentioned that this was for a prospective employer, but if I was still in school then I could just outright ask a professor where I can find a clue for this or he would've outright told me in class. But, for a prospective job, I'm not sure who I would ask about any confusion. Thanks in advance for anyone's help.
A SQL query has a fixed set of output columns: you can't change it. To achieve this, you could concatenate the parts inside a CASE expression to produce one varchar column, but then you need something (a parameter) to switch the CASE.
So, this is presentation, not querying SQL.
I'd return all 4 columns mentioned and decide how I want them in the client.
Unless you have just been asked for 2 different queries on the same SQL table.
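If you do go the CASE route, a sketch could look like the following. It runs on SQLite through Python's sqlite3 module, so the concatenation operator is ||; other databases may use + or CONCAT() instead. The EmployeeID column, the sample rows, the function name, and the 'last_first' / 'full' style names are all invented for the example; only the Employees table and the four name parts come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Employees table; MiddleName and Suffix may be NULL.
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, MiddleName TEXT, "
    "LastName TEXT, Suffix TEXT)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ada", None, "Lovelace", None),
        (2, "Martin", "Luther", "King", "Jr."),
    ],
)

# The bound parameter (?) is the "something to switch the CASE".
# NULL || ' ' is NULL, so COALESCE drops missing parts together with their space.
QUERY = """
SELECT CASE ?
           WHEN 'last_first' THEN LastName || ', ' || FirstName
           ELSE FirstName || ' '
                || COALESCE(MiddleName || ' ', '')
                || LastName
                || COALESCE(' ' || Suffix, '')
       END AS DisplayName
FROM Employees
ORDER BY EmployeeID
"""


def display_names(connection, style):
    """Return the formatted names; style is 'last_first' or anything else for full."""
    return [row[0] for row in connection.execute(QUERY, (style,))]


print(display_names(conn, "last_first"))
print(display_names(conn, "full"))
```

The query still returns exactly one column either way, which is the point made above: the shape of the output is fixed and only its contents switch.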
You haven't specified the RDBMS, but in SQL Server you could accomplish this using Computed Columns.
Typically, you would use a View over the table.